Bayesian GLM Part5

Author

Murray Logan

Published

07/07/2025

1 Preparations

Load the necessary libraries

library(tidyverse) # for data wrangling etc
library(rstanarm) # for fitting models in STAN
library(cmdstanr) # for cmdstan
library(brms) # for fitting models in STAN
library(coda) # for diagnostics
library(bayesplot) # for diagnostics
library(ggmcmc) # for MCMC diagnostics
library(DHARMa) # for residual diagnostics
library(rstan) # for interfacing with STAN
library(emmeans) # for marginal means etc
library(broom) # for tidying outputs
library(tidybayes) # for more tidying outputs
library(ggeffects) # for partial plots
library(broom.mixed) # for summarising models
library(bayestestR) # for ROPE
library(see) # for some plots
library(easystats) # for the easystats ecosystem
library(ggridges) # for ridge plots
library(patchwork) # for multiple plots
library(modelsummary) # for data and model summaries
theme_set(theme_grey()) # put the default ggplot theme back
source("helperFunctions.R")

2 Scenario

Here is a modified example from Quinn and Keough (2002). Day and Quinn (1989) described an experiment that examined how rock surface type affected the recruitment of barnacles to a rocky shore. The experiment had a single factor, surface type, with 4 treatments or levels: algal species 1 (ALG1), algal species 2 (ALG2), naturally bare surfaces (NB) and artificially scraped bare surfaces (S). There were 5 replicate plots for each surface type and the response (dependent) variable was the number of newly recruited barnacles on each plot after 4 weeks.

Figure 1: Six-plated barnacle
Table 1: Format of day.csv data files
TREAT BARNACLE
ALG1 27
.. ..
ALG2 24
.. ..
NB 9
.. ..
S 12
.. ..
Table 2: Description of the variables in the day data file
TREAT Categorical listing of surface types. ALG1 = algal species 1, ALG2 = algal species 2, NB = naturally bare surface, S = scraped bare surface.
BARNACLE The number of newly recruited barnacles on each plot after 4 weeks.

3 Read in the data

day <- read_csv("../data/day.csv", trim_ws = TRUE)
Rows: 20 Columns: 2
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): TREAT
dbl (1): BARNACLE

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
glimpse(day)
Rows: 20
Columns: 2
$ TREAT    <chr> "ALG1", "ALG1", "ALG1", "ALG1", "ALG1", "ALG2", "ALG2", "ALG2…
$ BARNACLE <dbl> 27, 19, 18, 23, 25, 24, 33, 27, 26, 32, 9, 13, 17, 14, 22, 12…
## Explore the first 6 rows of the data
head(day)
# A tibble: 6 × 2
  TREAT BARNACLE
  <chr>    <dbl>
1 ALG1        27
2 ALG1        19
3 ALG1        18
4 ALG1        23
5 ALG1        25
6 ALG2        24
str(day)
spc_tbl_ [20 × 2] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ TREAT   : chr [1:20] "ALG1" "ALG1" "ALG1" "ALG1" ...
 $ BARNACLE: num [1:20] 27 19 18 23 25 24 33 27 26 32 ...
 - attr(*, "spec")=
  .. cols(
  ..   TREAT = col_character(),
  ..   BARNACLE = col_double()
  .. )
 - attr(*, "problems")=<externalptr> 
day |> datawizard::data_codebook()
day (20 rows and 2 variables, 2 shown)

ID | Name     | Type      | Missings |  Values |         N
---+----------+-----------+----------+---------+----------
1  | TREAT    | character | 0 (0.0%) |    ALG1 | 5 (25.0%)
   |          |           |          |    ALG2 | 5 (25.0%)
   |          |           |          |      NB | 5 (25.0%)
   |          |           |          |       S | 5 (25.0%)
---+----------+-----------+----------+---------+----------
2  | BARNACLE | numeric   | 0 (0.0%) | [8, 33] |        20
----------------------------------------------------------
day |> modelsummary::datasummary_skim()
           Unique  Missing Pct.  Mean   SD  Min  Median   Max
BARNACLE       19             0  19.8  7.4  8.0    19.5  33.0

TREAT  N     %
ALG1   5  25.0
ALG2   5  25.0
NB     5  25.0
S      5  25.0
day |> modelsummary::datasummary_skim(by = "TREAT")
          TREAT  Unique  Missing Pct.  Mean   SD   Min  Median   Max
BARNACLE   ALG1       5             0  22.4  3.8  18.0    23.0  27.0
           ALG2       5             0  28.4  3.9  24.0    27.0  33.0
           NB         5             0  15.0  4.8   9.0    14.0  22.0
           S          5             0  13.2  4.5   8.0    12.0  20.0

TREAT  N     %
ALG1   5  25.0
ALG2   5  25.0
NB     5  25.0
S      5  25.0

Start by declaring the categorical variables as factor.

day <- day |> mutate(TREAT = factor(TREAT))

Model formula: \[ \begin{align} y_i &\sim{} \mathcal{Pois}(\lambda_i)\\ log(\lambda_i) &= \boldsymbol{\beta} \bf{X_i}\\ \beta_0 &\sim{} \mathcal{N}(3.1, 1)\\ \beta_{1,2,3} &\sim{} \mathcal{N}(0,1)\\ \end{align} \]

where \(\boldsymbol{\beta}\) is a vector of effects parameters and \(\bf{X}\) is a model matrix representing the intercept and treatment contrasts for the effects of Treatment on barnacle recruitment.
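To make the model matrix \(\bf{X}\) concrete, we can construct it directly (a quick illustrative sketch):

```r
## the treatment-contrast model matrix: ALG1 is the reference level and the
## remaining columns code the differences of ALG2, NB and S from ALG1
model.matrix(~TREAT, data = day) |> head()
```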

4 Exploratory data analysis

The exploratory data analyses that we performed in the frequentist instalment of this example are equally valid here. That is, boxplots and/or violin plots for each population (substrate type).

day |> ggplot(aes(y = BARNACLE, x = TREAT)) +
  geom_boxplot() +
  geom_point(color = "red")

day |> ggplot(aes(y = BARNACLE, x = TREAT)) +
  geom_violin() +
  geom_point(color = "red")

Conclusions:

  • although exploratory data analysis suggests that we might well be fine modelling these data against a Gaussian distribution, a Poisson distribution is clearly a more natural choice for count data and would also prevent any non-positive predictions.
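A Poisson model assumes the mean and variance are roughly equal; before committing, we might sanity-check this per treatment group (a quick sketch, not part of the original workflow):

```r
## for a Poisson, group variances should be of similar magnitude to group means
day |>
  group_by(TREAT) |>
  summarise(Mean = mean(BARNACLE), Var = var(BARNACLE))
```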

5 Fit the model

In rstanarm, the default priors are designed to be weakly informative. They are chosen to provide moderate regularisation (to help prevent over-fitting) and help stabilise the computations.

day.rstanarm <- stan_glm(BARNACLE ~ TREAT,
  data = day,
  family = poisson(link = "log"),
  iter = 5000, warmup = 1000,
  chains = 3, thin = 5, refresh = 0
)
prior_summary(day.rstanarm)
Priors for model 'day.rstanarm' 
------
Intercept (after predictors centered)
 ~ normal(location = 0, scale = 2.5)

Coefficients
  Specified prior:
    ~ normal(location = [0,0,0], scale = [2.5,2.5,2.5])
  Adjusted prior:
    ~ normal(location = [0,0,0], scale = [5.63,5.63,5.63])
------
See help('prior_summary.stanreg') for more details

This tells us:

  • for the intercept (defined with the predictors centred), when the family is Poisson, it uses a normal prior with a mean of 0 and a standard deviation of 2.5. A scale of 2.5 is the default for all intercepts; the scale may be adjusted for the data (autoscaling), but the adjusted version is only kept if it is larger than 2.5.

  • for the coefficients (in this case, the three treatment contrasts), the default prior is a normal prior centred on 0 with a standard deviation of 2.5. This is then adjusted for the scale of the data by dividing the 2.5 by the standard deviation of each of the numerical dummy variables for the predictor (the displayed values are rounded).

2.5 / sd(model.matrix(~TREAT, day)[, 2])
[1] 5.627314
  • there is no auxiliary prior as we are employing a Poisson distribution.

One way to assess the priors is to have the MCMC sampler draw purely from the prior predictive distribution, without conditioning on the observed data. Doing so provides a glimpse of the range of predictions possible under the priors. On the one hand, wide-ranging predictions help ensure that the priors are unlikely to influence the actual predictions once they are conditioned on the data. On the other hand, if they are too wide, the sampler is permitted to traverse regions of parameter space that are not logically possible in the actual underlying ecological context. Not only does this make illogical parameter estimates possible; when the sampler traverses regions of parameter space that are not supported by the data, it can also become unstable and have difficulty converging.

We can draw from the prior predictive distribution instead of conditioning on the response, by updating the model and indicating prior_PD=TRUE. After refitting the model in this way, we can plot the predictions to gain insights into the range of predictions supported by the priors alone.

day.rstanarm1 <- update(day.rstanarm, prior_PD = TRUE)
day.rstanarm1 |>
  ggpredict() |>
  plot(show_data = TRUE)
Data points may overlap. Use the `jitter` argument to add some amount of
  random variation to the location of data points and avoid overplotting.

ggpredict(day.rstanarm1) |>
  plot(show_data = TRUE) |>
  wrap_plots() &
  scale_y_log10()
Data points may overlap. Use the `jitter` argument to add some amount of
  random variation to the location of data points and avoid overplotting.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.

Conclusions:

  • we see that the range of predictions is fairly wide and the predicted means could range from 0 to very large (perhaps too large).

The following link provides some guidance about defining priors. [https://github.com/stan-dev/stan/wiki/Prior-Choice-Recommendations]

When defining our own priors, we typically do not want them to be autoscaled (hence autoscale = FALSE).

If we wanted to define our own priors that were less vague, yet still unlikely to bias the outcomes, we could try the following priors (mainly plucked out of thin air):

  • \(\beta_0\): normal centred at 3 with a standard deviation of 1
    • mean of 3: guided by mean(log(day$BARNACLE))
    • sd of 1: guided by sd(log(day$BARNACLE))
  • \(\beta_{1,2,3}\): normal centred at 0 with a standard deviation of 1
    • sd of 1: guided by sd(log(day$BARNACLE))/apply(model.matrix(~TREAT, data = day), 2, sd)
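The empirical quantities guiding these choices can be computed directly (the Intercept column below has zero standard deviation, so its ratio is Inf and can be ignored):

```r
## guides for the priors on the log link scale
mean(log(day$BARNACLE)) # location guide for the intercept prior
sd(log(day$BARNACLE)) # scale guide for the intercept prior
## scale guide for each effect: response sd divided by dummy-variable sd
sd(log(day$BARNACLE)) / apply(model.matrix(~TREAT, data = day), 2, sd)
```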

I will also overlay the raw data for comparison.

day.rstanarm2 <- stan_glm(BARNACLE ~ TREAT,
  data = day,
  family = poisson(link = "log"),
  prior_intercept = normal(3, 1, autoscale = FALSE),
  prior = normal(0, 1, autoscale = FALSE),
  prior_PD = TRUE,
  iter = 5000, warmup = 1000,
  chains = 3, thin = 5, refresh = 0
)
ggpredict(day.rstanarm2) |>
  plot(show_data = TRUE) +
  scale_y_log10()
Data points may overlap. Use the `jitter` argument to add some amount of
  random variation to the location of data points and avoid overplotting.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.

Now let's refit, conditioning on the data.

day.rstanarm3 <- update(day.rstanarm2, prior_PD = FALSE)
posterior_vs_prior(day.rstanarm3,
  color_by = "vs", group_by = TRUE,
  facet_args = list(scales = "free_y")
)

Drawing from prior...

Conclusions:

  • in each case, the prior is substantially wider than the posterior, suggesting that the posterior is not biased towards the prior.
ggpredict(day.rstanarm3) |> plot(show_data = TRUE)
Data points may overlap. Use the `jitter` argument to add some amount of
  random variation to the location of data points and avoid overplotting.

In brms, the default priors are designed to be weakly informative. They are chosen to provide moderate regularisation (to help prevent over-fitting) and help stabilise the computations.

Unlike rstanarm, brms models must be compiled before they start sampling. For most models, compilation of the Stan code takes around 45 seconds.

day.form <- bf(BARNACLE ~ TREAT,
  family = poisson(link = "log")
)
day.brm <- brm(day.form,
  data = day,
  iter = 5000,
  warmup = 2500,
  chains = 3, cores = 3,
  thin = 5,
  refresh = 0,
  backend = "rstan"
)
Compiling Stan program...
Start sampling
day.brm |> prior_summary()
                prior     class      coef group resp dpar nlpar lb ub       source
               (flat)         b                                            default
               (flat)         b TREATALG2                             (vectorized)
               (flat)         b   TREATNB                             (vectorized)
               (flat)         b    TREATS                             (vectorized)
 student_t(3, 3, 2.5) Intercept                                            default

This tells us:

  • for the intercept, it is using a Student t prior (a heavier-tailed alternative to the normal) with 3 degrees of freedom, a location of 3 and a scale of 2.5. The location is derived from the data (the median of the response on the link scale); the degrees of freedom and scale are the defaults.

  • for the beta coefficients (in this case, each treatment effect), the default prior is an improper flat prior. A flat prior essentially means that any value between negative infinity and positive infinity is considered equally likely. Whilst this might seem reckless, in practice it seems to work reasonably well for non-intercept beta parameters.

  • since we have nominated a Poisson distribution, there is no auxiliary prior.
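Note that the models day.brm1 and day.brm2 used below are not defined in this excerpt. Mirroring the rstanarm workflow above, day.brm1 is presumably a prior-predictive version of day.brm, and day.brm2 a prior-predictive model with our own priors; the sketch below is an assumption based on that pattern and on the priors visible later in the generated Stan code (normal(3.1, 2.2) for the intercept and normal(0, 2.4) for the effects):

```r
## ASSUMED definitions (not shown in this excerpt)
## sample from the priors only (sample_prior = "only" ignores the likelihood)
day.brm1 <- day.brm |> update(sample_prior = "only", refresh = 0)

## ASSUMED: prior-predictive model with our own weakly informative priors
priors <- prior(normal(3.1, 2.2), class = "Intercept") +
  prior(normal(0, 2.4), class = "b")
day.brm2 <- brm(day.form,
  data = day,
  prior = priors,
  sample_prior = "only",
  iter = 5000, warmup = 2500,
  chains = 3, cores = 3, thin = 5,
  refresh = 0, backend = "rstan"
)
```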

day.brm1 |>
  ggpredict() |>
  plot(show_data = TRUE)
Data points may overlap. Use the `jitter` argument to add some amount of
  random variation to the location of data points and avoid overplotting.

day.brm1 |>
  ggpredict() |>
  plot(show_data = TRUE) |>
  wrap_plots() &
  scale_y_log10()
Data points may overlap. Use the `jitter` argument to add some amount of
  random variation to the location of data points and avoid overplotting.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.

## day.brm1 |> ggpredict() |> plot(show_data=TRUE) |> `[[`(1) + scale_y_log10()
day.brm1 |>
  ggemmeans(~TREAT) |>
  plot(show_data = TRUE) +
  scale_y_log10()
Data points may overlap. Use the `jitter` argument to add some amount of
  random variation to the location of data points and avoid overplotting.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.

day.brm1 |>
  conditional_effects() |>
  plot(points = TRUE)

day.brm1 |>
  conditional_effects() |>
  plot(points = TRUE) |>
  wrap_plots() &
  scale_y_log10()

Conclusions:

  • we see that the range of predictions is fairly wide (ranging from 0 to very high)
day.brm2 |>
  ggpredict() |>
  plot(show_data = TRUE)
Data points may overlap. Use the `jitter` argument to add some amount of
  random variation to the location of data points and avoid overplotting.

day.brm2 |>
  ggpredict() |>
  plot(show_data = TRUE) |>
  wrap_plots() &
  scale_y_log10()
Data points may overlap. Use the `jitter` argument to add some amount of
  random variation to the location of data points and avoid overplotting.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.

day.brm2 |>
  ggemmeans(~TREAT) |>
  plot(show_data = TRUE)
Data points may overlap. Use the `jitter` argument to add some amount of
  random variation to the location of data points and avoid overplotting.

day.brm2 |>
  ggemmeans(~TREAT) |>
  plot(show_data = TRUE) |>
  wrap_plots() &
  scale_y_log10()
Data points may overlap. Use the `jitter` argument to add some amount of
  random variation to the location of data points and avoid overplotting.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.

day.brm2 |>
  conditional_effects() |>
  plot(points = TRUE)

day.brm2 |>
  conditional_effects() |>
  plot(points = TRUE) |>
  wrap_plots() &
  scale_y_log10()

day.brm3 <- day.brm2 |> update(sample_prior = "yes", refresh = 0)
The desired updates require recompiling the model
Compiling Stan program...
Start sampling
day.brm3 |> get_variables()
 [1] "b_Intercept"     "b_TREATALG2"     "b_TREATNB"       "b_TREATS"       
 [5] "Intercept"       "prior_Intercept" "prior_b"         "lprior"         
 [9] "lp__"            "accept_stat__"   "stepsize__"      "treedepth__"    
[13] "n_leapfrog__"    "divergent__"     "energy__"       
day.brm3 |>
  hypothesis("TREATALG2<0") |>
  plot()

day.brm3 |>
  hypothesis("TREATNB<0") |>
  plot()

day.brm3 |>
  hypothesis("TREATS<0") |>
  plot()

day.brm3 |> SUYR_prior_and_posterior()

day.brm3 |> standata()
$N
[1] 20

$Y
 [1] 27 19 18 23 25 24 33 27 26 32  9 13 17 14 22 12  8 15 20 11

$K
[1] 4

$Kc
[1] 3

$X
   Intercept TREATALG2 TREATNB TREATS
1          1         0       0      0
2          1         0       0      0
3          1         0       0      0
4          1         0       0      0
5          1         0       0      0
6          1         1       0      0
7          1         1       0      0
8          1         1       0      0
9          1         1       0      0
10         1         1       0      0
11         1         0       1      0
12         1         0       1      0
13         1         0       1      0
14         1         0       1      0
15         1         0       1      0
16         1         0       0      1
17         1         0       0      1
18         1         0       0      1
19         1         0       0      1
20         1         0       0      1
attr(,"assign")
[1] 0 1 1 1
attr(,"contrasts")
attr(,"contrasts")$TREAT
     ALG2 NB S
ALG1    0  0 0
ALG2    1  0 0
NB      0  1 0
S       0  0 1


$prior_only
[1] 0

attr(,"class")
[1] "standata" "list"    
day.brm3 |> stancode()
// generated with brms 2.22.0
functions {
}
data {
  int<lower=1> N;  // total number of observations
  array[N] int Y;  // response variable
  int<lower=1> K;  // number of population-level effects
  matrix[N, K] X;  // population-level design matrix
  int<lower=1> Kc;  // number of population-level effects after centering
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
  matrix[N, Kc] Xc;  // centered version of X without an intercept
  vector[Kc] means_X;  // column means of X before centering
  for (i in 2:K) {
    means_X[i - 1] = mean(X[, i]);
    Xc[, i - 1] = X[, i] - means_X[i - 1];
  }
}
parameters {
  vector[Kc] b;  // regression coefficients
  real Intercept;  // temporary intercept for centered predictors
}
transformed parameters {
  real lprior = 0;  // prior contributions to the log posterior
  lprior += normal_lpdf(b | 0, 2.4);
  lprior += normal_lpdf(Intercept | 3.1, 2.2);
}
model {
  // likelihood including constants
  if (!prior_only) {
    target += poisson_log_glm_lpmf(Y | Xc, Intercept, b);
  }
  // priors including constants
  target += lprior;
}
generated quantities {
  // actual population-level intercept
  real b_Intercept = Intercept - dot_product(means_X, b);
  // additionally sample draws from priors
  real prior_b = normal_rng(0,2.4);
  real prior_Intercept = normal_rng(3.1,2.2);
}

6 MCMC sampling diagnostics

The bayesplot package offers a range of MCMC diagnostics as well as posterior predictive checks (PPC), all of which have a convenient plot() interface. Let's start with the MCMC diagnostics.

See list of available diagnostics by name
available_mcmc()
bayesplot MCMC module:
  mcmc_acf
  mcmc_acf_bar
  mcmc_areas
  mcmc_areas_ridges
  mcmc_combo
  mcmc_dens
  mcmc_dens_chains
  mcmc_dens_overlay
  mcmc_hex
  mcmc_hist
  mcmc_hist_by_chain
  mcmc_intervals
  mcmc_neff
  mcmc_neff_hist
  mcmc_nuts_acceptance
  mcmc_nuts_divergence
  mcmc_nuts_energy
  mcmc_nuts_stepsize
  mcmc_nuts_treedepth
  mcmc_pairs
  mcmc_parcoord
  mcmc_rank_ecdf
  mcmc_rank_hist
  mcmc_rank_overlay
  mcmc_recover_hist
  mcmc_recover_intervals
  mcmc_recover_scatter
  mcmc_rhat
  mcmc_rhat_hist
  mcmc_scatter
  mcmc_trace
  mcmc_trace_highlight
  mcmc_violin

Of these, we will focus on:

  • mcmc_trace: this plots the estimates of each parameter over the post-warmup length of each MCMC chain. Each chain is plotted in a different shade of blue, with each parameter in its own facet. Ideally, each trace should just look like noise without any discernible drift and each of the traces for a specific parameter should look the same (i.e, should not be displaced above or below any other trace for that parameter).
plot(day.rstanarm3, plotfun = "mcmc_trace")

The chains appear well mixed and very similar

  • acf (auto-correlation function): plots the auto-correlation between successive MCMC sample lags for each parameter and each chain
plot(day.rstanarm3, "acf_bar")

There is no evidence of auto-correlation in the MCMC samples

  • Rhat: Rhat is a measure of convergence between the chains. The closer the values are to 1, the more the chains have converged. Values greater than 1.05 indicate a lack of convergence. There will be an Rhat value for each parameter estimated.
plot(day.rstanarm3, "rhat_hist")
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

All Rhat values are below 1.05, suggesting the chains have converged.

  • neff (number of effective samples): the ratio of the number of effective samples (those not rejected by the sampler) to the number of samples provides an indication of the effectiveness (and efficiency) of the MCMC sampler. Ratios that are less than 0.5 for a parameter suggest that the sampler spent considerable time in difficult areas of the sampling domain and rejected more than half of the samples (replacing them with the previous effective sample).

    If the ratios are low, tightening the priors may help.

plot(day.rstanarm3, "neff_hist")
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Ratios all very high.
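The histograms above can be backed up with the underlying numbers; bayesplot's rhat() and neff_ratio() helpers work directly on stanreg objects (a quick sketch):

```r
## numeric versions of the convergence diagnostics
rhat(day.rstanarm3) # values should all be below 1.05
neff_ratio(day.rstanarm3) # ratios well above 0.5 are reassuring
```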

More diagnostics
plot(day.rstanarm3, "combo")

plot(day.rstanarm3, "violin")

The rstan package offers a similar range of MCMC diagnostics. Let's start with the trace plots.

Of these, we will focus on:

  • stan_trace: this plots the estimates of each parameter over the post-warmup length of each MCMC chain. Each chain is plotted in a different colour, with each parameter in its own facet. Ideally, each trace should just look like noise without any discernible drift and each of the traces for a specific parameter should look the same (i.e, should not be displaced above or below any other trace for that parameter).
stan_trace(day.rstanarm3)

The chains appear well mixed and very similar

  • stan_acf (autocorrelation function): plots the autocorrelation between successive MCMC sample lags for each parameter and each chain
stan_ac(day.rstanarm3)

There is no evidence of auto-correlation in the MCMC samples

  • stan_rhat: Rhat is a scale reduction factor measure of convergence between the chains. The closer the values are to 1, the more the chains have converged. Values greater than 1.05 indicate a lack of convergence. There will be an Rhat value for each parameter estimated.
stan_rhat(day.rstanarm3)
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

All Rhat values are below 1.05, suggesting the chains have converged.

  • stan_ess (number of effective samples): the ratio of the number of effective samples (those not rejected by the sampler) to the number of samples provides an indication of the effectiveness (and efficiency) of the MCMC sampler. Ratios that are less than 0.5 for a parameter suggest that the sampler spent considerable time in difficult areas of the sampling domain and rejected more than half of the samples (replacing them with the previous effective sample).

    If the ratios are low, tightening the priors may help.

stan_ess(day.rstanarm3)
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Ratios all very high.

stan_dens(day.rstanarm3, separate_chains = TRUE)

The ggmcmc package also has a set of MCMC diagnostic functions. Let's start with the MCMC diagnostics.

Of these, we will focus on:

  • ggs_traceplot: this plots the estimates of each parameter over the post-warmup length of each MCMC chain. Each chain is plotted in a different colour, with each parameter in its own facet. Ideally, each trace should just look like noise without any discernible drift and each of the traces for a specific parameter should look the same (i.e, should not be displaced above or below any other trace for that parameter).
day.ggs <- ggs(day.rstanarm3)
ggs_traceplot(day.ggs)

The chains appear well mixed and very similar

  • ggs_autocorrelation (autocorrelation function): plots the autocorrelation between successive MCMC sample lags for each parameter and each chain
ggs_autocorrelation(day.ggs)

There is no evidence of auto-correlation in the MCMC samples

  • ggs_Rhat: Rhat is a scale reduction factor measure of convergence between the chains. The closer the values are to 1, the more the chains have converged. Values greater than 1.05 indicate a lack of convergence. There will be an Rhat value for each parameter estimated.
ggs_Rhat(day.ggs)

All Rhat values are below 1.05, suggesting the chains have converged.

  • ggs_effective (number of effective samples): the ratio of the number of effective samples (those not rejected by the sampler) to the number of samples provides an indication of the effectiveness (and efficiency) of the MCMC sampler. Ratios that are less than 0.5 for a parameter suggest that the sampler spent considerable time in difficult areas of the sampling domain and rejected more than half of the samples (replacing them with the previous effective sample).

    If the ratios are low, tightening the priors may help.

ggs_effective(day.ggs)
Warning: Returning more (or less) than 1 row per `summarise()` group was deprecated in
dplyr 1.1.0.
ℹ Please use `reframe()` instead.
ℹ When switching from `summarise()` to `reframe()`, remember that `reframe()`
  always returns an ungrouped data frame and adjust accordingly.
ℹ The deprecated feature was likely used in the ggmcmc package.
  Please report the issue at <https://github.com/xfim/ggmcmc/issues/>.

Ratios all very high.

More diagnostics
ggs_crosscorrelation(day.ggs)

ggs_grb(day.ggs)

6.0.1 stan plots

For brms models, the rstan stan_* diagnostic plots can be applied to the underlying stanfit object. Let's start with the MCMC diagnostics.

Of these, we will focus on:

  • stan_trace: this plots the estimates of each parameter over the post-warmup length of each MCMC chain. Each chain is plotted in a different colour, with each parameter in its own facet. Ideally, each trace should just look like noise without any discernible drift and each of the traces for a specific parameter should look the same (i.e, should not be displaced above or below any other trace for that parameter).
day.brm3$fit |> stan_trace()

day.brm3$fit |> stan_trace(inc_warmup = TRUE)

The chains appear well mixed and very similar

  • stan_acf (auto-correlation function): plots the auto-correlation between successive MCMC sample lags for each parameter and each chain
day.brm3$fit |> stan_ac()

There is no evidence of auto-correlation in the MCMC samples

  • stan_rhat: Rhat is a scale reduction factor measure of convergence between the chains. The closer the values are to 1, the more the chains have converged. Values greater than 1.05 indicate a lack of convergence. There will be an Rhat value for each parameter estimated.
day.brm3$fit |> stan_rhat()
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

All Rhat values are below 1.05, suggesting the chains have converged.

  • stan_ess (number of effective samples): the ratio of the number of effective samples (those not rejected by the sampler) to the number of samples provides an indication of the effectiveness (and efficiency) of the MCMC sampler. Ratios that are less than 0.5 for a parameter suggest that the sampler spent considerable time in difficult areas of the sampling domain and rejected more than half of the samples (replacing them with the previous effective sample).

    If the ratios are low, tightening the priors may help.

day.brm3$fit |> stan_ess()
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Ratios all very high.

day.brm3$fit |> stan_dens(separate_chains = TRUE)

7 Model validation

Posterior predictive checks provide additional diagnostics about the fit of the model. Specifically, they provide a comparison between predictions drawn from the model and the observed data used to train the model.

See list of available diagnostics by name
available_ppc()
bayesplot PPC module:
  ppc_bars
  ppc_bars_grouped
  ppc_boxplot
  ppc_dens
  ppc_dens_overlay
  ppc_dens_overlay_grouped
  ppc_ecdf_overlay
  ppc_ecdf_overlay_grouped
  ppc_error_binned
  ppc_error_hist
  ppc_error_hist_grouped
  ppc_error_scatter
  ppc_error_scatter_avg
  ppc_error_scatter_avg_grouped
  ppc_error_scatter_avg_vs_x
  ppc_freqpoly
  ppc_freqpoly_grouped
  ppc_hist
  ppc_intervals
  ppc_intervals_grouped
  ppc_km_overlay
  ppc_km_overlay_grouped
  ppc_loo_intervals
  ppc_loo_pit_ecdf
  ppc_loo_pit_overlay
  ppc_loo_pit_qq
  ppc_loo_ribbon
  ppc_pit_ecdf
  ppc_pit_ecdf_grouped
  ppc_ribbon
  ppc_ribbon_grouped
  ppc_rootogram
  ppc_scatter
  ppc_scatter_avg
  ppc_scatter_avg_grouped
  ppc_stat
  ppc_stat_2d
  ppc_stat_freqpoly
  ppc_stat_freqpoly_grouped
  ppc_stat_grouped
  ppc_violin_grouped
  • dens_overlay: plots the density distribution of the observed data (black line) overlayed on top of 50 density distributions generated from draws from the model (light blue). Ideally, the 50 realisations should be roughly consistent with the observed data.
pp_check(day.rstanarm3, plotfun = "dens_overlay")

The model draws do not appear to deviate markedly from the observed data.

  • error_scatter_avg: this plots the observed values against the average residuals. Similar to a residual plot, we do not want to see any patterns in this plot. There is some pattern remaining in these residuals.
pp_check(day.rstanarm3, plotfun = "error_scatter_avg")

The predictive error seems to be related to the response - the model performs poorest at higher barnacle counts.

  • error_scatter_avg_vs_x: this is similar to a regular residual plot and as such should be interpreted in the same way.
pp_check(day.rstanarm3, x = as.numeric(day$TREAT), plotfun = "error_scatter_avg_vs_x")

  • intervals: plots the observed data overlayed on top of posterior predictions associated with each level of the predictor. Ideally, the observed data should all fall within the predictive intervals.
pp_check(day.rstanarm3, x = as.numeric(day$TREAT), plotfun = "intervals")

The modelled predictions seem to underestimate the uncertainty at higher barnacle counts.

The shinystan package allows the full suite of MCMC diagnostics and posterior predictive checks to be accessed via a web interface.

# library(shinystan)
# launch_shinystan(day.rstanarm3)

DHARMa residuals provide very useful diagnostics. Unfortunately, we cannot directly use the simulateResiduals() function to generate the simulated residuals. However, if we are willing to calculate some of the components ourselves, we can still obtain the simulated residuals from the fitted stan model.

We need to supply:

  • simulated (predicted) responses associated with each observation.
  • observed values
  • fitted (predicted) responses (averaged) associated with each observation
preds <- posterior_predict(day.rstanarm3, ndraws = 250, summary = FALSE)
day.resids <- createDHARMa(
  simulatedResponse = t(preds),
  observedResponse = day$BARNACLE,
  fittedPredictedResponse = apply(preds, 2, median),
  integerResponse = TRUE
)
plot(day.resids)

Conclusions:

  • the simulated residuals DO NOT suggest any issues with the fitted model
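Since Poisson models assume equidispersion, the DHARMa residuals can also be interrogated more formally (a sketch using DHARMa's testDispersion() and testZeroInflation()):

```r
## formal dispersion test on the simulated residuals
day.resids |> testDispersion()
## and a test for zero-inflation, another common issue with Poisson models
day.resids |> testZeroInflation()
```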

Posterior predictive checks provide additional diagnostics about the fit of the model. Specifically, they provide a comparison between predictions drawn from the model and the observed data used to train the model.

  • dens_overlay: plots the density distribution of the observed data (black line) overlayed on top of 50 density distributions generated from draws from the model (light blue). Ideally, the 50 realisations should be roughly consistent with the observed data.
day.brm3 |> pp_check(type = "dens_overlay", ndraws = 200)

The model draws appear to deviate somewhat from the observed data.

  • error_scatter_avg: this plots the observed values against the average residuals. Similar to a residual plot, we do not want to see any patterns in this plot. There is some pattern remaining in these residuals.
day.brm3 |> pp_check(type = "error_scatter_avg")
Using all posterior draws for ppc type 'error_scatter_avg' by default.

The predictive error appears to be related to the predictor - the model performs poorest for the surface types with higher recruitment.

  • intervals: plots the observed data overlaid on top of posterior predictions associated with each level of the predictor. Ideally, the observed data should all fall within the predictive intervals.
day.brm3 |> pp_check(group = "TREAT", type = "intervals")
Using all posterior draws for ppc type 'intervals' by default.
Warning: The following arguments were unrecognized and ignored: group

The modelled predictions appear to underestimate the uncertainty for some of the surface types.
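Note the warning above: the group argument is ignored for the plain "intervals" type. To display the prediction intervals separately for each treatment, the grouped variant of this check can be used instead (a sketch, assuming the same fitted model):

```r
day.brm3 |> pp_check(type = "intervals_grouped", group = "TREAT")
```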

The shinystan package allows the full suite of MCMC diagnostics and posterior predictive checks to be accessed via a web interface.

# library(shinystan)
# launch_shinystan(day.brm3)

DHARMa residuals provide very useful diagnostics. Unfortunately, we cannot directly use the simulateResiduals() function to generate the simulated residuals. However, if we are willing to calculate some of the components ourselves, we can still obtain simulated residuals from the fitted Stan model.

We need to supply:

  • simulated (predicted) responses associated with each observation.
  • observed values
  • fitted (predicted) responses (averaged) associated with each observation
preds <- day.brm3 |> posterior_predict(ndraws = 250, summary = FALSE)
day.resids <- createDHARMa(
  simulatedResponse = t(preds),
  observedResponse = day$BARNACLE,
  fittedPredictedResponse = apply(preds, 2, median),
  integerResponse = TRUE
)
day.resids |> plot()

day.resids <- make_brms_dharma_res(day.brm3, integerResponse = TRUE)
wrap_elements(~ testUniformity(day.resids)) +
  wrap_elements(~ plotResiduals(day.resids, form = factor(rep(1, nrow(day))))) +
  wrap_elements(~ plotResiduals(day.resids, quantreg = TRUE)) +
  wrap_elements(~ testDispersion(day.resids))

Conclusions:

  • the simulated residuals DO NOT suggest any issues with the model fit

8 Partial effects plots

day.rstanarm3 |>
  ggpredict() |>
  plot(show_data = TRUE)
Data points may overlap. Use the `jitter` argument to add some amount of
  random variation to the location of data points and avoid overplotting.

day.rstanarm3 |>
  ggemmeans(~TREAT, type = "fixed") |>
  plot(show_data = TRUE)
Data points may overlap. Use the `jitter` argument to add some amount of
  random variation to the location of data points and avoid overplotting.

day.rstanarm3 |>
  epred_draws(newdata = day) |>
  median_hdci() |>
  ggplot(aes(x = TREAT, y = .epred)) +
  geom_pointrange(aes(ymin = .lower, ymax = .upper)) +
  geom_line() +
  geom_point(data = day, aes(y = BARNACLE, x = TREAT))

day.brm3 |>
  conditional_effects() |>
  plot(points = TRUE)

day.brm3 |>
  ggpredict() |>
  plot(show_data = TRUE)
Data points may overlap. Use the `jitter` argument to add some amount of
  random variation to the location of data points and avoid overplotting.

day.brm3 |>
  ggemmeans(~TREAT) |>
  plot(show_data = TRUE)
Data points may overlap. Use the `jitter` argument to add some amount of
  random variation to the location of data points and avoid overplotting.

day.brm3 |>
  epred_draws(newdata = day) |>
  median_hdci() |>
  ggplot(aes(x = TREAT, y = .epred)) +
  geom_pointrange(aes(ymin = .lower, ymax = .upper)) +
  geom_line() +
  geom_point(data = day, aes(y = BARNACLE, x = TREAT))

9 Model investigation

The summary() method generates simple summaries (mean and standard deviation, as well as the 10, 50 and 90 percentiles).

summary(day.rstanarm3)

Model Info:
 function:     stan_glm
 family:       poisson [log]
 formula:      BARNACLE ~ TREAT
 algorithm:    sampling
 sample:       2400 (posterior sample size)
 priors:       see help('prior_summary')
 observations: 20
 predictors:   4

Estimates:
              mean   sd   10%   50%   90%
(Intercept)  3.1    0.1  3.0   3.1   3.2 
TREATALG2    0.2    0.1  0.1   0.2   0.4 
TREATNB     -0.4    0.1 -0.6  -0.4  -0.2 
TREATS      -0.5    0.2 -0.7  -0.5  -0.3 

Fit Diagnostics:
           mean   sd   10%   50%   90%
mean_PPD 19.7    1.4 18.0  19.7  21.6 

The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).

MCMC diagnostics
              mcse Rhat n_eff
(Intercept)   0.0  1.0  2235 
TREATALG2     0.0  1.0  2411 
TREATNB       0.0  1.0  2433 
TREATS        0.0  1.0  2425 
mean_PPD      0.0  1.0  2290 
log-posterior 0.0  1.0  2603 

For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).

Conclusions:

  • in the Model Info, we are informed that the total MCMC posterior sample size is 2400 and that there were 20 raw observations.
  • the estimated mean (expected number of newly recruited barnacles) on the ALG1 surface is 3.1. This is the mean of the posterior distribution for this parameter. If we back-transform this to the response scale, this becomes 22.14.
  • the estimated effect of ALG2 vs ALG1 is 0.24 (mean) with a standard deviation of 0.12. The 10 and 90 percentiles indicate that we are 80% confident that the effect is between 0.1 and 0.4 - e.g. there is a positive effect. When back-transformed onto the response scale, we see that barnacle recruitment on ALG2 is 1.28 times higher than that on ALG1. This represents a 28% increase in barnacle recruitment.
  • the estimated effect of NB and S are -0.39 and -0.52 respectively, which equate to 1.48 and 1.68 fold declines respectively.
  • Rhat and number of effective samples for each parameter are also provided as MCMC diagnostics and all look good.
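Each of the back-transformations quoted in the conclusions is just an exponentiation, since the Poisson model uses a log link. For example (a sketch, using the rounded values from the summary above):

```r
exp(3.1)        # expected ALG1 recruitment on the response scale (approx. 22)
exp(0.24)       # multiplicative effect of ALG2 relative to ALG1 (approx. 1.27)
1 / exp(-0.39)  # fold decline associated with NB (approx. 1.48)
1 / exp(-0.52)  # fold decline associated with S (approx. 1.68)
```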
tidyMCMC(day.rstanarm3$stanfit, estimate.method = "median", conf.int = TRUE, conf.method = "HPDinterval", rhat = TRUE, ess = TRUE)
# A tibble: 6 × 7
  term          estimate std.error conf.low conf.high  rhat   ess
  <chr>            <dbl>     <dbl>    <dbl>     <dbl> <dbl> <int>
1 (Intercept)      3.10     0.0929   2.91       3.28  0.999  2235
2 TREATALG2        0.245    0.123    0.0142     0.486 0.999  2411
3 TREATNB         -0.392    0.146   -0.684     -0.121 0.999  2433
4 TREATS          -0.518    0.152   -0.838     -0.246 1.00   2425
5 mean_PPD        19.7      1.41    17.1       22.6   1.00   2290
6 log-posterior  -62.0      1.36   -64.7      -60.2   1.00   2603

Conclusions:

  • the estimated mean (expected number of newly recruited barnacles) on the ALG1 surface is 3.1. This is the mean of the posterior distribution for this parameter. If we back-transform this to the response scale, this becomes 22.14.
  • the estimated effect of ALG2 vs ALG1 is 0.24 (median) with a standard error of 0.12. The 95% credibility intervals indicate that we are 95% confident that the effect is between 0.01 and 0.49 - e.g. there is a significant positive effect. When back-transformed onto the response scale, we see that barnacle recruitment on ALG2 is 1.28 times higher than that on ALG1. This represents a 28% increase in barnacle recruitment.
  • the estimated effect of NB and S are -0.39 and -0.52 respectively, which equate to 1.48 and 1.68 fold declines respectively.
  • Rhat and number of effective samples for each parameter are also provided as MCMC diagnostics and all look good.
day.rstanarm3$stanfit |>
  summarise_draws(
    median,
    HDInterval::hdi,
    rhat, length, ess_bulk, ess_tail
  )
# A tibble: 6 × 8
  variable       median    lower   upper  rhat length ess_bulk ess_tail
  <chr>           <dbl>    <dbl>   <dbl> <dbl>  <dbl>    <dbl>    <dbl>
1 (Intercept)     3.10    2.91     3.28   1.00   2400    2249.    2402.
2 TREATALG2       0.244   0.0142   0.486  1.00   2400    2410.    2381.
3 TREATNB        -0.394  -0.684   -0.121  1.00   2400    2436.    2275.
4 TREATS         -0.517  -0.838   -0.246  1.00   2400    2467.    2302.
5 mean_PPD       19.7    17.1     22.6    1.00   2400    2315.    2290.
6 log-posterior -61.7   -64.7    -60.2    1.00   2400    2582.    2369.

We can also alter the CI level.

day.rstanarm3$stanfit |>
  summarise_draws(
    median,
    ~ HDInterval::hdi(.x, credMass = 0.9),
    rhat, length, ess_bulk, ess_tail
  )
# A tibble: 6 × 8
  variable       median    lower   upper  rhat length ess_bulk ess_tail
  <chr>           <dbl>    <dbl>   <dbl> <dbl>  <dbl>    <dbl>    <dbl>
1 (Intercept)     3.10    2.95     3.26   1.00   2400    2249.    2402.
2 TREATALG2       0.244   0.0490   0.446  1.00   2400    2410.    2381.
3 TREATNB        -0.394  -0.624   -0.145  1.00   2400    2436.    2275.
4 TREATS         -0.517  -0.760   -0.258  1.00   2400    2467.    2302.
5 mean_PPD       19.7    17.6     22.2    1.00   2400    2315.    2290.
6 log-posterior -61.7   -63.9    -60.2    1.00   2400    2582.    2369.

Arguably, it would be better to back-transform to the ratio scale

day.rstanarm3$stanfit |>
  summarise_draws(
    ~ median(exp(.x)),
    ~ HDInterval::hdi(exp(.x)),
    rhat, length, ess_bulk, ess_tail
  )
# A tibble: 6 × 8
  variable   `~median(exp(.x))`    lower    upper  rhat length ess_bulk ess_tail
  <chr>                   <dbl>    <dbl>    <dbl> <dbl>  <dbl>    <dbl>    <dbl>
1 (Intercep…           2.21e+ 1 1.83e+ 1 2.63e+ 1  1.00   2400    2249.    2402.
2 TREATALG2            1.28e+ 0 9.98e- 1 1.60e+ 0  1.00   2400    2410.    2381.
3 TREATNB              6.75e- 1 4.94e- 1 8.70e- 1  1.00   2400    2436.    2275.
4 TREATS               5.96e- 1 4.33e- 1 7.82e- 1  1.00   2400    2467.    2302.
5 mean_PPD             3.59e+ 8 8.45e+ 6 3.96e+ 9  1.00   2400    2315.    2290.
6 log-poste…           1.53e-27 7.76e-31 5.69e-27  1.00   2400    2582.    2369.
day.rstanarm3$stanfit |> as_draws_df()
# A draws_df: 800 iterations, 3 chains, and 6 variables
   (Intercept) TREATALG2 TREATNB TREATS mean_PPD log-posterior
1          3.2      0.19   -0.35 -0.702       20           -62
2          2.9      0.43   -0.19 -0.095       19           -65
3          3.1      0.34   -0.52 -0.488       20           -61
4          3.2      0.22   -0.50 -0.571       21           -61
5          2.9      0.31   -0.21 -0.439       17           -63
6          3.1      0.32   -0.42 -0.652       17           -61
7          3.1      0.15   -0.37 -0.595       19           -61
8          3.1      0.17   -0.24 -0.474       19           -61
9          3.1      0.32   -0.36 -0.248       21           -63
10         3.1      0.21   -0.28 -0.294       19           -62
# ... with 2390 more draws
# ... hidden reserved variables {'.chain', '.iteration', '.draw'}
day.rstanarm3$stanfit |>
  as_draws_df() |>
  summarise_draws(
    median,
    ~ HDInterval::hdi(.x),
    rhat,
    length,
    ess_bulk, ess_tail
  )
# A tibble: 6 × 8
  variable       median    lower   upper  rhat length ess_bulk ess_tail
  <chr>           <dbl>    <dbl>   <dbl> <dbl>  <dbl>    <dbl>    <dbl>
1 (Intercept)     3.10    2.91     3.28   1.00   2400    2249.    2402.
2 TREATALG2       0.244   0.0142   0.486  1.00   2400    2410.    2381.
3 TREATNB        -0.394  -0.684   -0.121  1.00   2400    2436.    2275.
4 TREATS         -0.517  -0.838   -0.246  1.00   2400    2467.    2302.
5 mean_PPD       19.7    17.1     22.6    1.00   2400    2315.    2290.
6 log-posterior -61.7   -64.7    -60.2    1.00   2400    2582.    2369.
day.rstanarm3$stanfit |>
  as_draws_df() |>
  exp() |>
  summarise_draws(
    median,
    ~ HDInterval::hdi(.x),
    rhat,
    length,
    ess_bulk, ess_tail
  )
# A tibble: 6 × 8
  variable        median    lower    upper  rhat length ess_bulk ess_tail
  <chr>            <dbl>    <dbl>    <dbl> <dbl>  <dbl>    <dbl>    <dbl>
1 (Intercept)   2.21e+ 1 1.83e+ 1 2.63e+ 1  1.00   2400    2249.    2402.
2 TREATALG2     1.28e+ 0 9.98e- 1 1.60e+ 0  1.00   2400    2410.    2381.
3 TREATNB       6.75e- 1 4.94e- 1 8.70e- 1  1.00   2400    2436.    2275.
4 TREATS        5.96e- 1 4.33e- 1 7.82e- 1  1.00   2400    2467.    2302.
5 mean_PPD      3.59e+ 8 8.45e+ 6 3.96e+ 9  1.00   2400    2315.    2290.
6 log-posterior 1.53e-27 7.76e-31 5.69e-27  1.00   2400    2582.    2369.

Since we are only interested in the regression parameters (and not the sampler diagnostic variables that are also stored), it is convenient to select them using the regex version.

day.rstanarm3 |> get_variables()
 [1] "(Intercept)"   "TREATALG2"     "TREATNB"       "TREATS"       
 [5] "accept_stat__" "stepsize__"    "treedepth__"   "n_leapfrog__" 
 [9] "divergent__"   "energy__"     
day.draw <- day.rstanarm3 |> gather_draws(`.Intercept.*|.*TREAT.*`, regex = TRUE)
day.draw
# A tibble: 9,600 × 5
# Groups:   .variable [4]
   .chain .iteration .draw .variable   .value
    <int>      <int> <int> <chr>        <dbl>
 1      1          1     1 (Intercept)   3.20
 2      1          2     2 (Intercept)   2.87
 3      1          3     3 (Intercept)   3.07
 4      1          4     4 (Intercept)   3.16
 5      1          5     5 (Intercept)   2.92
 6      1          6     6 (Intercept)   3.11
 7      1          7     7 (Intercept)   3.14
 8      1          8     8 (Intercept)   3.07
 9      1          9     9 (Intercept)   3.08
10      1         10    10 (Intercept)   3.06
# ℹ 9,590 more rows
exceedP <- function(x, Val = 0) mean(x > Val)

day.rstanarm3 |>
  gather_draws(`.Intercept.*|.*TREAT.*`, regex = TRUE) |>
  mutate(.value = exp(.value)) |>
  summarise_draws(
    median,
    HDInterval::hdi,
    rhat,
    length,
    ess_bulk,
    ess_tail,
    ~ exceedP(.x, 1)
  )
# A tibble: 4 × 10
# Groups:   .variable [4]
  .variable   variable median  lower  upper  rhat length ess_bulk ess_tail
  <chr>       <chr>     <dbl>  <dbl>  <dbl> <dbl>  <dbl>    <dbl>    <dbl>
1 (Intercept) .value   22.1   18.3   26.3    1.00   2400    2249.    2402.
2 TREATALG2   .value    1.28   0.998  1.60   1.00   2400    2410.    2381.
3 TREATNB     .value    0.675  0.494  0.870  1.00   2400    2436.    2275.
4 TREATS      .value    0.596  0.433  0.782  1.00   2400    2467.    2302.
# ℹ 1 more variable: `~exceedP(.x, 1)` <dbl>

We can then summarise this

day.draw |> median_hdci(.value)
# A tibble: 4 × 7
  .variable   .value   .lower .upper .width .point .interval
  <chr>        <dbl>    <dbl>  <dbl>  <dbl> <chr>  <chr>    
1 (Intercept)  3.10   2.91     3.27    0.95 median hdci     
2 TREATALG2    0.244  0.00934  0.482   0.95 median hdci     
3 TREATNB     -0.394 -0.684   -0.121   0.95 median hdci     
4 TREATS      -0.517 -0.833   -0.241   0.95 median hdci     

We could alternatively express the parameters on the response scale.

day.draw |> median_hdci(exp(.value))
# A tibble: 4 × 7
  .variable   `exp(.value)` .lower .upper .width .point .interval
  <chr>               <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>    
1 (Intercept)        22.1   18.3   26.3     0.95 median hdci     
2 TREATALG2           1.28   0.998  1.60    0.95 median hdci     
3 TREATNB             0.675  0.494  0.870   0.95 median hdci     
4 TREATS              0.596  0.421  0.772   0.95 median hdci     

Conclusions:

  • the estimated mean (expected number of newly recruited barnacles) on the ALG1 surface is 3.1. This is the mean of the posterior distribution for this parameter. If we back-transform this to the response scale, this becomes 22.11.
  • the estimated effect of ALG2 vs ALG1 is 0.24 (median) with a standard error of 0.12. The 95% credibility intervals indicate that we are 95% confident that the effect is between 0.01 and 0.48 - e.g. there is a significant positive effect. When back-transformed onto the response scale, we see that barnacle recruitment on ALG2 is 1.28 times higher than that on ALG1. This represents a 28% increase in barnacle recruitment.
  • the estimated effect of NB and S are -0.39 and -0.52 respectively, which equate to 1.48 and 1.68 fold declines respectively.
  • Rhat and number of effective samples for each parameter are also provided as MCMC diagnostics and all look good.
day.rstanarm3 |> plot(plotfun = "mcmc_intervals")

day.rstanarm3 |>
  gather_draws(`.Intercept.*|.*TREAT.*`, regex = TRUE) |>
  ggplot() +
  stat_halfeye(aes(x = .value, y = .variable)) +
  facet_wrap(~.variable, scales = "free")

day.rstanarm3 |>
  gather_draws(`.*TREAT.*`, regex = TRUE) |>
  ggplot() +
  stat_halfeye(aes(x = .value, y = .variable)) +
  geom_vline(xintercept = 0, linetype = "dashed")

day.rstanarm3 |>
  gather_draws(`.*TREAT.*`, regex = TRUE) |>
  ggplot() +
  stat_halfeye(aes(x = exp(.value), y = .variable)) +
  geom_vline(xintercept = 1, linetype = "dashed") +
  scale_x_continuous(trans = scales::log2_trans())

day.rstanarm3 |>
  gather_draws(`.*TREAT.*`, regex = TRUE) |>
  ggplot() +
  geom_density_ridges(aes(x = .value, y = .variable), alpha = 0.4) +
  geom_vline(xintercept = 0, linetype = "dashed")
Picking joint bandwidth of 0.0265

## Or on a fractional scale
day.rstanarm3 |>
  gather_draws(`.*TREAT.*`, regex = TRUE) |>
  ggplot() +
  geom_density_ridges_gradient(
    aes(
      x = exp(.value),
      y = .variable,
      fill = after_stat(x)
    ),
    alpha = 0.4, colour = "white",
    quantile_lines = TRUE,
    quantiles = c(0.025, 0.975)
  ) +
  geom_vline(xintercept = 1, linetype = "dashed") +
  scale_x_continuous(trans = scales::log2_trans()) +
  scale_fill_viridis_c(option = "C")
Picking joint bandwidth of 0.0382

This is purely a graphical depiction of the posteriors.

day.rstanarm3 |> tidy_draws()
# A tibble: 2,400 × 13
   .chain .iteration .draw `(Intercept)` TREATALG2 TREATNB  TREATS accept_stat__
    <int>      <int> <int>         <dbl>     <dbl>   <dbl>   <dbl>         <dbl>
 1      1          1     1          3.20     0.187  -0.353 -0.702          0.989
 2      1          2     2          2.87     0.428  -0.187 -0.0954         0.949
 3      1          3     3          3.07     0.340  -0.518 -0.488          0.976
 4      1          4     4          3.16     0.221  -0.503 -0.571          0.988
 5      1          5     5          2.92     0.313  -0.212 -0.439          0.991
 6      1          6     6          3.11     0.322  -0.417 -0.652          0.925
 7      1          7     7          3.14     0.150  -0.374 -0.595          0.969
 8      1          8     8          3.07     0.170  -0.243 -0.474          1    
 9      1          9     9          3.08     0.323  -0.364 -0.248          0.842
10      1         10    10          3.06     0.206  -0.279 -0.294          1    
# ℹ 2,390 more rows
# ℹ 5 more variables: stepsize__ <dbl>, treedepth__ <dbl>, n_leapfrog__ <dbl>,
#   divergent__ <dbl>, energy__ <dbl>
day.rstanarm3 |> spread_draws(`.Intercept.*|.*TREAT.*`, regex = TRUE)
# A tibble: 2,400 × 7
   .chain .iteration .draw `(Intercept)` TREATALG2 TREATNB  TREATS
    <int>      <int> <int>         <dbl>     <dbl>   <dbl>   <dbl>
 1      1          1     1          3.20     0.187  -0.353 -0.702 
 2      1          2     2          2.87     0.428  -0.187 -0.0954
 3      1          3     3          3.07     0.340  -0.518 -0.488 
 4      1          4     4          3.16     0.221  -0.503 -0.571 
 5      1          5     5          2.92     0.313  -0.212 -0.439 
 6      1          6     6          3.11     0.322  -0.417 -0.652 
 7      1          7     7          3.14     0.150  -0.374 -0.595 
 8      1          8     8          3.07     0.170  -0.243 -0.474 
 9      1          9     9          3.08     0.323  -0.364 -0.248 
10      1         10    10          3.06     0.206  -0.279 -0.294 
# ℹ 2,390 more rows
day.rstanarm3 |>
  posterior_samples() |>
  as_tibble()
Warning: Method 'posterior_samples' is deprecated. Please see ?as_draws for
recommended alternatives.
# A tibble: 2,400 × 4
   `(Intercept)` TREATALG2 TREATNB  TREATS
           <dbl>     <dbl>   <dbl>   <dbl>
 1          3.20     0.187  -0.353 -0.702 
 2          2.87     0.428  -0.187 -0.0954
 3          3.07     0.340  -0.518 -0.488 
 4          3.16     0.221  -0.503 -0.571 
 5          2.92     0.313  -0.212 -0.439 
 6          3.11     0.322  -0.417 -0.652 
 7          3.14     0.150  -0.374 -0.595 
 8          3.07     0.170  -0.243 -0.474 
 9          3.08     0.323  -0.364 -0.248 
10          3.06     0.206  -0.279 -0.294 
# ℹ 2,390 more rows

Unfortunately, \(R^2\) calculations for families other than Gaussian and binomial have not yet been implemented for rstanarm models.

# day.rstanarm3 |> bayes_R2() |> median_hdci

The summary() method generates simple summaries (posterior means, estimated errors and 95% credibility intervals).

day.brm3 |> summary()
 Family: poisson 
  Links: mu = log 
Formula: BARNACLE ~ TREAT 
   Data: day (Number of observations: 20) 
  Draws: 3 chains, each with iter = 5000; warmup = 1000; thin = 5;
         total post-warmup draws = 2400

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept     3.10      0.09     2.91     3.28 1.00     2017     2179
TREATALG2     0.24      0.13    -0.01     0.49 1.00     1976     2253
TREATNB      -0.40      0.15    -0.68    -0.10 1.00     2247     2252
TREATS       -0.53      0.16    -0.83    -0.21 1.00     2523     2452

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Conclusions:

  • in the preamble, we are informed that the total number of post-warmup draws is 2400 and that there were 20 raw observations.
  • the estimated mean (expected number of newly recruited barnacles) on the ALG1 surface is 3.1. This is the mean of the posterior distribution for this parameter. If we back-transform this to the response scale, this becomes 22.22.
  • the estimated effect of ALG2 vs ALG1 is 0.24 (mean) with an estimated error (standard deviation) of 0.13. The 95% credibility intervals indicate that we are 95% confident that the effect is between -0.01 and 0.49 - since this interval marginally includes zero, the evidence for a positive effect is suggestive rather than conclusive. When back-transformed onto the response scale, we see that barnacle recruitment on ALG2 is 1.27 times higher than that on ALG1. This represents a 27% increase in barnacle recruitment.
  • the estimated effect of NB and S are -0.4 and -0.53 respectively, which equate to 1.48 and 1.69 fold decline respectively.
  • Rhat and number of effective samples for each parameter are also provided as MCMC diagnostics and all look good.
day.brm3$fit |> tidyMCMC(
  estimate.method = "median",
  conf.int = TRUE,
  conf.method = "HPDinterval",
  rhat = TRUE,
  ess = TRUE
)
# A tibble: 8 × 7
  term            estimate std.error conf.low conf.high  rhat   ess
  <chr>              <dbl>     <dbl>    <dbl>     <dbl> <dbl> <int>
1 b_Intercept       3.10      0.0944   2.91      3.28   1.00   1972
2 b_TREATALG2       0.241     0.127   -0.0154    0.486  1.00   1964
3 b_TREATNB        -0.395     0.149   -0.684    -0.0962 1.00   2240
4 b_TREATS         -0.526     0.156   -0.817    -0.199  1.000  2519
5 Intercept         2.93      0.0533   2.83      3.04   0.999  2491
6 prior_Intercept   3.03      2.23    -1.39      7.23   1.00   2338
7 prior_b           0.0222    2.39    -4.45      4.85   1.00   2361
8 lprior           -7.14      0.0191  -7.18     -7.11   0.999  2509

Conclusions:

  • the estimated mean (expected number of newly recruited barnacles) on the ALG1 surface is 3.1. This is the mean of the posterior distribution for this parameter. If we back-transform this to the response scale, this becomes 22.22.
  • the estimated effect of ALG2 vs ALG1 is 0.24 (median) with a standard error of 0.13. The 95% credibility intervals indicate that we are 95% confident that the effect is between -0.02 and 0.49 - since this interval marginally includes zero, the evidence for a positive effect is suggestive rather than conclusive. When back-transformed onto the response scale, we see that barnacle recruitment on ALG2 is 1.27 times higher than that on ALG1. This represents a 27% increase in barnacle recruitment.
  • the estimated effect of NB and S are -0.4 and -0.53 respectively, which equate to 1.48 and 1.69 fold declines respectively.
  • Rhat and number of effective samples for each parameter are also provided as MCMC diagnostics and all look good.
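As an aside (not part of the original workflow), brms also provides the hypothesis() function for making one-sided probability statements about parameters, which complements the interval summaries above; a sketch:

```r
# Posterior evidence ratio and probability that ALG2 recruitment exceeds ALG1
day.brm3 |> brms::hypothesis("TREATALG2 > 0")
```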
day.brm3 |> as_draws_df()
# A draws_df: 800 iterations, 3 chains, and 9 variables
   b_Intercept b_TREATALG2 b_TREATNB b_TREATS Intercept prior_Intercept prior_b
1          3.1        0.26     -0.36    -0.50       3.0             1.1    0.98
2          3.0        0.30     -0.20    -0.23       3.0             3.6    1.68
3          3.1        0.35     -0.44    -0.51       2.9             1.9   -2.24
4          3.3        0.16     -0.48    -0.69       3.0             4.2    1.55
5          3.2        0.20     -0.60    -0.24       3.0             1.4   -1.28
6          3.2        0.13     -0.52    -0.82       2.9             4.7    5.52
7          3.0        0.44     -0.37    -0.74       2.8             1.4    1.55
8          3.2        0.21     -0.34    -0.61       3.0             5.1   -1.30
9          3.2        0.29     -0.41    -0.45       3.0             1.1   -1.32
10         3.2        0.14     -0.51    -0.54       3.0             6.5    2.64
   lprior
1    -7.1
2    -7.1
3    -7.1
4    -7.2
5    -7.1
6    -7.2
7    -7.2
8    -7.1
9    -7.1
10   -7.1
# ... with 2390 more draws, and 1 more variables
# ... hidden reserved variables {'.chain', '.iteration', '.draw'}
day.brm3 |>
  as_draws_df() |>
  summarise_draws(
    median,
    HDInterval::hdi,
    rhat,
    length,
    ess_bulk, ess_tail
  )
# A tibble: 9 × 8
  variable            median    lower    upper  rhat length ess_bulk ess_tail
  <chr>                <dbl>    <dbl>    <dbl> <dbl>  <dbl>    <dbl>    <dbl>
1 b_Intercept       3.10       2.91     3.28   1.00    2400    2018.    2179.
2 b_TREATALG2       0.242     -0.0154   0.486  1.00    2400    1976.    2253.
3 b_TREATNB        -0.396     -0.684   -0.0962 1.00    2400    2247.    2252.
4 b_TREATS         -0.527     -0.817   -0.199  1.000   2400    2523.    2452.
5 Intercept         2.93       2.83     3.04   0.999   2400    2483.    2349.
6 prior_Intercept   2.99      -1.39     7.23   1.00    2400    2321.    2305.
7 prior_b           0.000167  -4.45     4.85   1.00    2400    2365.    2023.
8 lprior           -7.14      -7.18    -7.11   0.999   2400    2499.    2328.
9 lp__            -65.0      -68.3    -63.4    1.00    2400    2357.    2225.
day.brm3 |>
  as_draws_df() |>
  exp() |>
  summarise_draws(
    median,
    HDInterval::hdi,
    rhat,
    length,
    ess_bulk, ess_tail
  )
# A tibble: 9 × 8
  variable          median    lower    upper  rhat length ess_bulk ess_tail
  <chr>              <dbl>    <dbl>    <dbl> <dbl>  <dbl>    <dbl>    <dbl>
1 b_Intercept     2.23e+ 1 1.83e+ 1 2.66e+ 1 1.00    2400    2018.    2179.
2 b_TREATALG2     1.27e+ 0 9.76e- 1 1.62e+ 0 1.00    2400    1976.    2253.
3 b_TREATNB       6.73e- 1 4.89e- 1 8.91e- 1 1.00    2400    2247.    2252.
4 b_TREATS        5.90e- 1 4.21e- 1 7.86e- 1 1.000   2400    2523.    2452.
5 Intercept       1.87e+ 1 1.67e+ 1 2.06e+ 1 0.999   2400    2483.    2349.
6 prior_Intercept 1.99e+ 1 1.12e- 2 9.78e+ 2 1.00    2400    2321.    2305.
7 prior_b         1.00e+ 0 6.63e- 4 5.45e+ 1 1.00    2400    2365.    2023.
8 lprior          7.93e- 4 7.62e- 4 8.18e- 4 0.999   2400    2499.    2328.
9 lp__            6.09e-29 6.53e-33 2.27e-28 1.00    2400    2357.    2225.

Since we are only interested in the regression parameters (and not the prior draws or sampler diagnostic variables that are also stored), it is convenient to select them using the regex version.

day.brm3 |> get_variables()
 [1] "b_Intercept"     "b_TREATALG2"     "b_TREATNB"       "b_TREATS"       
 [5] "Intercept"       "prior_Intercept" "prior_b"         "lprior"         
 [9] "lp__"            "accept_stat__"   "stepsize__"      "treedepth__"    
[13] "n_leapfrog__"    "divergent__"     "energy__"       
day.draw <- day.brm3 |>
  gather_draws(`.Intercept.*|.*TREAT.*`, regex = TRUE)
day.draw
# A tibble: 7,200 × 5
# Groups:   .variable [3]
   .chain .iteration .draw .variable   .value
    <int>      <int> <int> <chr>        <dbl>
 1      1          1     1 b_TREATALG2  0.256
 2      1          2     2 b_TREATALG2  0.301
 3      1          3     3 b_TREATALG2  0.353
 4      1          4     4 b_TREATALG2  0.165
 5      1          5     5 b_TREATALG2  0.201
 6      1          6     6 b_TREATALG2  0.135
 7      1          7     7 b_TREATALG2  0.444
 8      1          8     8 b_TREATALG2  0.213
 9      1          9     9 b_TREATALG2  0.295
10      1         10    10 b_TREATALG2  0.142
# ℹ 7,190 more rows
day.brm3 |>
  gather_draws(`.Intercept.*|.*TREAT.*`, regex = TRUE) |>
  mutate(.value = exp(.value)) |>
  summarise_draws(
    median,
    ~ HDInterval::hdi(.x, credMass = 0.95),
    rhat,
    length,
    ess_bulk, ess_tail
  )
# A tibble: 3 × 9
# Groups:   .variable [3]
  .variable   variable median lower upper  rhat length ess_bulk ess_tail
  <chr>       <chr>     <dbl> <dbl> <dbl> <dbl>  <dbl>    <dbl>    <dbl>
1 b_TREATALG2 .value    1.27  0.976 1.62  1.00    2400    1976.    2253.
2 b_TREATNB   .value    0.673 0.489 0.891 1.00    2400    2247.    2252.
3 b_TREATS    .value    0.590 0.421 0.786 1.000   2400    2523.    2452.
exceedP <- function(x, Val = 0) mean(x > Val)
day.brm3 |>
  tidy_draws() |>
  exp() |>
  dplyr::select(starts_with("b_")) |>
  summarise_draws(
    median,
    ~ HDInterval::hdi(.x, credMass = 0.9),
    rhat,
    ess_bulk, ess_tail,
    ~ exceedP(.x, 1)
  )
# A tibble: 4 × 8
  variable    median  lower  upper  rhat ess_bulk ess_tail `~exceedP(.x, 1)`
  <chr>        <dbl>  <dbl>  <dbl> <dbl>    <dbl>    <dbl>             <dbl>
1 b_Intercept 22.3   18.7   25.5   1.00     1975.    2174.          1       
2 b_TREATALG2  1.27   1.00   1.54  1.00     1945.    2246.          0.969   
3 b_TREATNB    0.673  0.519  0.848 1.000    2241.    2246.          0.00542 
4 b_TREATS     0.590  0.445  0.745 1.000    2512.    2447.          0.000833

We can then summarise this

day.draw |> median_hdci(.value)
# A tibble: 3 × 7
  .variable   .value  .lower  .upper .width .point .interval
  <chr>        <dbl>   <dbl>   <dbl>  <dbl> <chr>  <chr>    
1 b_TREATALG2  0.242 -0.0129  0.491    0.95 median hdci     
2 b_TREATNB   -0.396 -0.684  -0.0962   0.95 median hdci     
3 b_TREATS    -0.527 -0.813  -0.195    0.95 median hdci     

We could alternatively express the parameters on the response scale.

day.draw |>
  median_hdci(exp(.value))
# A tibble: 3 × 7
  .variable   `exp(.value)` .lower .upper .width .point .interval
  <chr>               <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>    
1 b_TREATALG2         1.27   0.978  1.62    0.95 median hdci     
2 b_TREATNB           0.673  0.485  0.888   0.95 median hdci     
3 b_TREATS            0.590  0.421  0.786   0.95 median hdci     
day.brm3 |>
  gather_draws(`.Intercept.*|.*TREAT.*`, regex = TRUE) |>
  ggplot() +
  stat_halfeye(aes(x = .value, y = .variable)) +
  facet_wrap(~.variable, scales = "free")

Conclusions:

  • the estimated effect of ALG2 vs ALG1 is 0.24 (median). The 95% HPD credibility intervals indicate that we are 95% confident that the effect is between -0.01 and 0.49 - since this interval marginally includes zero, the evidence for a positive effect is suggestive rather than conclusive. When back-transformed onto the response scale, we see that barnacle recruitment on ALG2 is 1.27 times higher than that on ALG1. This represents a 27% increase in barnacle recruitment.
  • the estimated effects of NB and S are -0.40 and -0.53 respectively, which equate to 1.48 and 1.69 fold declines respectively.
  • Rhat and number of effective samples for each parameter are also provided as MCMC diagnostics and all look good.
day.brm3$fit |> plot(type = "intervals")
ci_level: 0.8 (80% intervals)
outer_level: 0.95 (95% intervals)

## Link scale
day.brm3 |>
  gather_draws(`.Intercept.*|.*TREAT.*`, regex = TRUE) |>
  ggplot() +
  stat_slab(aes(
    x = .value, y = .variable,
    fill = stat(ggdist::cut_cdf_qi(cdf,
      .width = c(0.5, 0.8, 0.95),
      labels = scales::percent_format()
    ))
  ), color = "black") +
  geom_vline(xintercept = 0, linetype = "dashed") +
  scale_fill_brewer("Interval", direction = -1, na.translate = FALSE)

## Fractional scale
day.brm3 |>
  gather_draws(`.Intercept.*|.*TREAT.*`, regex = TRUE) |>
  mutate(.value = exp(.value)) |>
  ggplot() +
  stat_slab(aes(
    x = .value, y = .variable,
    fill = stat(ggdist::cut_cdf_qi(cdf,
      .width = c(0.5, 0.8, 0.95),
      labels = scales::percent_format()
    ))
  ), color = "black") +
  geom_vline(xintercept = 1, linetype = "dashed") +
  scale_fill_brewer("Interval", direction = -1, na.translate = FALSE) +
  scale_x_continuous(trans = scales::log2_trans())

day.brm3 |>
  gather_draws(`.Intercept.*|.*TREAT.*`, regex = TRUE) |>
  ggplot() +
  stat_halfeye(aes(x = .value, y = .variable)) +
  facet_wrap(~.variable, scales = "free")

day.brm3 |>
  gather_draws(`.*TREAT.*`, regex = TRUE) |>
  ggplot() +
  stat_halfeye(aes(x = .value, y = .variable)) +
  geom_vline(xintercept = 0, linetype = "dashed")

day.brm3 |>
  gather_draws(`.*TREAT.*`, regex = TRUE) |>
  ggplot() +
  stat_halfeye(aes(x = exp(.value), y = .variable)) +
  geom_vline(xintercept = 1, linetype = "dashed") +
  scale_x_continuous(trans = scales::log2_trans())

day.brm3 |>
  gather_draws(`.*TREAT.*`, regex = TRUE) |>
  ggplot() +
  geom_density_ridges(aes(x = .value, y = .variable), alpha = 0.4) +
  geom_vline(xintercept = 0, linetype = "dashed")
Picking joint bandwidth of 0.0268

## Or on a fractional scale
day.brm3 |>
  gather_draws(`.*TREAT.*`, regex = TRUE) |>
  ggplot() +
  geom_density_ridges_gradient(
    aes(
      x = exp(.value),
      y = .variable,
      fill = stat(x)
    ),
    alpha = 0.4, colour = "white",
    quantile_lines = TRUE,
    quantiles = c(0.025, 0.975)
  ) +
  geom_vline(xintercept = 1, linetype = "dashed") +
  scale_x_continuous(trans = scales::log2_trans()) +
  scale_fill_viridis_c(option = "C")
Picking joint bandwidth of 0.0387

This is purely a graphical depiction of the posteriors.

day.brm3 |> tidy_draws()
# A tibble: 2,400 × 18
   .chain .iteration .draw b_Intercept b_TREATALG2 b_TREATNB b_TREATS Intercept
    <int>      <int> <int>       <dbl>       <dbl>     <dbl>    <dbl>     <dbl>
 1      1          1     1        3.15       0.256    -0.359   -0.497      3.00
 2      1          2     2        3.02       0.301    -0.196   -0.227      2.99
 3      1          3     3        3.08       0.353    -0.437   -0.511      2.93
 4      1          4     4        3.26       0.165    -0.481   -0.693      3.00
 5      1          5     5        3.17       0.201    -0.596   -0.238      3.02
 6      1          6     6        3.19       0.135    -0.520   -0.820      2.89
 7      1          7     7        3.01       0.444    -0.368   -0.736      2.84
 8      1          8     8        3.19       0.213    -0.341   -0.609      3.01
 9      1          9     9        3.18       0.295    -0.405   -0.453      3.04
10      1         10    10        3.22       0.142    -0.514   -0.538      3.00
# ℹ 2,390 more rows
# ℹ 10 more variables: prior_Intercept <dbl>, prior_b <dbl>, lprior <dbl>,
#   lp__ <dbl>, accept_stat__ <dbl>, stepsize__ <dbl>, treedepth__ <dbl>,
#   n_leapfrog__ <dbl>, divergent__ <dbl>, energy__ <dbl>
day.brm3 |> spread_draws(`.Intercept.*|.*TREAT.*`, regex = TRUE)
# A tibble: 2,400 × 6
   .chain .iteration .draw b_TREATALG2 b_TREATNB b_TREATS
    <int>      <int> <int>       <dbl>     <dbl>    <dbl>
 1      1          1     1       0.256    -0.359   -0.497
 2      1          2     2       0.301    -0.196   -0.227
 3      1          3     3       0.353    -0.437   -0.511
 4      1          4     4       0.165    -0.481   -0.693
 5      1          5     5       0.201    -0.596   -0.238
 6      1          6     6       0.135    -0.520   -0.820
 7      1          7     7       0.444    -0.368   -0.736
 8      1          8     8       0.213    -0.341   -0.609
 9      1          9     9       0.295    -0.405   -0.453
10      1         10    10       0.142    -0.514   -0.538
# ℹ 2,390 more rows
day.brm3 |>
  posterior_samples() |>
  as_tibble()
Warning: Method 'posterior_samples' is deprecated. Please see ?as_draws for
recommended alternatives.
# A tibble: 2,400 × 9
   b_Intercept b_TREATALG2 b_TREATNB b_TREATS Intercept prior_Intercept prior_b
         <dbl>       <dbl>     <dbl>    <dbl>     <dbl>           <dbl>   <dbl>
 1        3.15       0.256    -0.359   -0.497      3.00            1.14   0.976
 2        3.02       0.301    -0.196   -0.227      2.99            3.60   1.68 
 3        3.08       0.353    -0.437   -0.511      2.93            1.86  -2.24 
 4        3.26       0.165    -0.481   -0.693      3.00            4.19   1.55 
 5        3.17       0.201    -0.596   -0.238      3.02            1.44  -1.28 
 6        3.19       0.135    -0.520   -0.820      2.89            4.74   5.52 
 7        3.01       0.444    -0.368   -0.736      2.84            1.37   1.55 
 8        3.19       0.213    -0.341   -0.609      3.01            5.06  -1.30 
 9        3.18       0.295    -0.405   -0.453      3.04            1.07  -1.32 
10        3.22       0.142    -0.514   -0.538      3.00            6.50   2.64 
# ℹ 2,390 more rows
# ℹ 2 more variables: lprior <dbl>, lp__ <dbl>
day.brm3 |>
  bayes_R2(summary = FALSE) |>
  median_hdci()
          y     ymin      ymax .width .point .interval
1 0.6947422 0.504946 0.7749342   0.95 median      hdci

Region of Practical Equivalence

A common default (following Kruschke) is to set the ROPE to ±0.1 standard deviations of the response, computed here on the log (link) scale:

0.1 * sd(log(day$BARNACLE))
[1] 0.04111272
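
Under the hood, rope() simply reports the proportion of posterior draws that fall inside the nominated interval. A minimal sketch (rope_prop is a hypothetical helper, and fake_draws are simulated normal draws standing in for the real posterior):

```r
# Hypothetical helper: proportion of draws inside the ROPE
rope_prop <- function(draws, lower, upper) mean(draws >= lower & draws <= upper)

set.seed(1)
# Stand-in draws roughly shaped like the TREATALG2 posterior (median ~0.24)
fake_draws <- rnorm(2400, mean = 0.24, sd = 0.13)
rope_prop(fake_draws, -0.04, 0.04)
```
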
day.brm3 |> rope(range = c(-0.04, 0.04))
# Proportion of samples inside the ROPE [-0.04, 0.04]:

Parameter | inside ROPE
-----------------------
Intercept |      0.00 %
TREATALG2 |      3.64 %
TREATNB   |      0.00 %
TREATS    |      0.00 %
rope(day.brm3, range = c(-0.04, 0.04)) |> plot()

## Or based on fractional scale
day.brm3 |>
  as_draws_df("^b_TREAT.*", regex = TRUE) |>
  exp() |>
  ## equivalence_test(range = c(0.9, 1.1))
  rope(range = c(0.9, 1.1))
# Proportion of samples inside the ROPE [0.90, 1.10]:

Parameter | inside ROPE
-----------------------
TREATALG2 |     10.83 %
TREATNB   |      0.75 %
TREATS    |      0.00 %
day.mcmc <-
  day.brm3 |>
  as_draws_df("^b_TREAT.*", regex = TRUE) |>
  exp()
day.mcmc |>
  rope(range = c(0.9, 1.1))
# Proportion of samples inside the ROPE [0.90, 1.10]:

Parameter | inside ROPE
-----------------------
TREATALG2 |     10.83 %
TREATNB   |      0.75 %
TREATS    |      0.00 %
day.mcmc |>
  rope(range = c(0.9, 1.1)) |>
  plot(day.mcmc)

day.mcmc |>
  equivalence_test(range = c(0.9, 1.1))
# Test for Practical Equivalence

  ROPE: [0.90 1.10]

Parameter |        H0 | inside ROPE |      95% HDI
--------------------------------------------------
TREATALG2 | Undecided |     10.83 % | [0.99, 1.63]
TREATNB   | Undecided |      0.75 % | [0.50, 0.91]
TREATS    |  Rejected |      0.00 % | [0.44, 0.81]
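
The H0 column follows the usual HDI+ROPE decision rule: reject when the 95% HDI lies entirely outside the ROPE, accept when it lies entirely inside, and remain undecided otherwise. A sketch of that rule (decide is a hypothetical helper; the intervals are the HDIs printed above):

```r
# Hypothetical sketch of the HDI+ROPE decision rule
decide <- function(hdi_low, hdi_high, rope = c(0.9, 1.1)) {
  if (hdi_high < rope[1] || hdi_low > rope[2]) "Rejected"
  else if (hdi_low >= rope[1] && hdi_high <= rope[2]) "Accepted"
  else "Undecided"
}
decide(0.44, 0.81)  # TREATS: HDI entirely below the ROPE -> "Rejected"
decide(0.50, 0.91)  # TREATNB: HDI overlaps the ROPE -> "Undecided"
```
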

10 Further investigations

day.rstanarm3 |> emmeans(pairwise ~ TREAT, type = "response")
$emmeans
 TREAT rate lower.HPD upper.HPD
 ALG1  22.1      18.3      26.3
 ALG2  28.3      23.8      32.9
 NB    15.0      12.0      18.6
 S     13.2      10.4      16.5

Point estimate displayed: median 
Results are back-transformed from the log scale 
HPD interval probability: 0.95 

$contrasts
 contrast    ratio lower.HPD upper.HPD
 ALG1 / ALG2 0.783     0.615     0.986
 ALG1 / NB   1.483     1.112     1.958
 ALG1 / S    1.677     1.245     2.259
 ALG2 / NB   1.891     1.426     2.466
 ALG2 / S    2.134     1.558     2.797
 NB / S      1.134     0.803     1.542

Point estimate displayed: median 
Results are back-transformed from the log scale 
HPD interval probability: 0.95 

Conclusions:

  • the contrasts section of the output indicates that there is evidence of:
  • ALG1 has 1.48 fold (48%) more newly recruited barnacles than the NB substrate
  • ALG1 has 1.68 fold (68%) more newly recruited barnacles than the S substrate
  • ALG2 has 1.89 fold (89%) more newly recruited barnacles than the NB substrate
  • ALG2 has 2.13 fold (113%) more newly recruited barnacles than the S substrate
  • there is also evidence that recruitment on ALG2 exceeds that on ALG1 (the ALG1/ALG2 ratio of 0.78 has a 95% HPD interval that excludes 1), whereas NB was not found to be different to S
day.em <- emmeans(day.rstanarm3, pairwise ~ TREAT, type = "link")$contrasts |>
  gather_emmeans_draws() |>
  mutate(Fit = exp(.value))
day.em |> head()
# A tibble: 6 × 6
# Groups:   contrast [1]
  contrast    .chain .iteration .draw .value   Fit
  <chr>        <int>      <int> <int>  <dbl> <dbl>
1 ALG1 - ALG2     NA         NA     1 -0.187 0.830
2 ALG1 - ALG2     NA         NA     2 -0.428 0.652
3 ALG1 - ALG2     NA         NA     3 -0.340 0.712
4 ALG1 - ALG2     NA         NA     4 -0.221 0.802
5 ALG1 - ALG2     NA         NA     5 -0.313 0.731
6 ALG1 - ALG2     NA         NA     6 -0.322 0.725
day.em |>
  group_by(contrast) |>
  ggplot(aes(x = Fit)) +
  geom_histogram() +
  geom_vline(xintercept = 1, color = "red") +
  facet_wrap(~contrast, scales = "free")
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

day.em |>
  group_by(contrast) |>
  ggplot(aes(x = Fit)) +
  geom_density_ridges_gradient(aes(y = contrast, fill = stat(x)),
    alpha = 0.4, color = "white",
    quantile_lines = TRUE,
    quantiles = c(0.025, 0.975)
  ) +
  geom_vline(xintercept = 1, linetype = "dashed") +
  scale_fill_viridis_c(option = "C") +
  scale_x_continuous(trans = scales::log2_trans())
Picking joint bandwidth of 0.0396

day.em |>
  ggplot(aes(x = Fit)) +
  geom_density_ridges_gradient(
    aes(
      y = contrast,
      fill = factor(stat(x > 1))
    ),
    alpha = 0.4, colour = "white",
    quantile_lines = TRUE,
    quantiles = c(0.025, 0.975)
  ) +
  geom_vline(xintercept = 1, linetype = "dashed") +
  scale_x_continuous(trans = scales::log2_trans()) +
  scale_fill_viridis_d()
Picking joint bandwidth of 0.0396

day.em |>
  group_by(contrast) |>
  median_hdi()
# A tibble: 6 × 10
  contrast    .value .value.lower .value.upper   Fit Fit.lower Fit.upper .width
  <chr>        <dbl>        <dbl>        <dbl> <dbl>     <dbl>     <dbl>  <dbl>
1 ALG1 - ALG2 -0.244       -0.482     -0.00934 0.783     0.615     0.986   0.95
2 ALG1 - NB    0.394        0.121      0.684   1.48      1.09      1.94    0.95
3 ALG1 - S     0.517        0.241      0.833   1.68      1.24      2.26    0.95
4 ALG2 - NB    0.637        0.375      0.918   1.89      1.40      2.44    0.95
5 ALG2 - S     0.758        0.497      1.07    2.13      1.56      2.80    0.95
6 NB - S       0.125       -0.173      0.468   1.13      0.806     1.55    0.95
# ℹ 2 more variables: .point <chr>, .interval <chr>
# Probability of effect
day.em |>
  group_by(contrast) |>
  summarize(P = mean(Fit > 1))
# A tibble: 6 × 2
  contrast         P
  <chr>        <dbl>
1 ALG1 - ALG2 0.0221
2 ALG1 - NB   0.998 
3 ALG1 - S    1.000 
4 ALG2 - NB   1     
5 ALG2 - S    1     
6 NB - S      0.782 
day.em |>
  group_by(contrast) |>
  summarize(P = mean(Fit < 1))
# A tibble: 6 × 2
  contrast           P
  <chr>          <dbl>
1 ALG1 - ALG2 0.978   
2 ALG1 - NB   0.00208 
3 ALG1 - S    0.000417
4 ALG2 - NB   0       
5 ALG2 - S    0       
6 NB - S      0.218   
## Probability of effect greater than 10%
day.em |>
  group_by(contrast) |>
  summarize(P = mean(Fit > 1.1))
# A tibble: 6 × 2
  contrast          P
  <chr>         <dbl>
1 ALG1 - ALG2 0.00417
2 ALG1 - NB   0.980  
3 ALG1 - S    0.998  
4 ALG2 - NB   1      
5 ALG2 - S    1      
6 NB - S      0.559  
day.brm3 |>
  emmeans(~TREAT, type = "response") |>
  pairs()
Warning: Method 'parse_bf' is deprecated. Please use 'brmsterms' instead.
 contrast    ratio lower.HPD upper.HPD
 ALG1 / ALG2 0.785     0.604      1.00
 ALG1 / NB   1.486     1.087      1.96
 ALG1 / S    1.694     1.208      2.25
 ALG2 / NB   1.890     1.418      2.47
 ALG2 / S    2.150     1.606      2.87
 NB / S      1.136     0.824      1.58

Point estimate displayed: median 
Results are back-transformed from the log scale 
HPD interval probability: 0.95 
# OR
day.pairwise <- day.brm3 |>
  emmeans(~TREAT, type = "response") |>
  pairs() |>
  as.data.frame()
Warning: Method 'parse_bf' is deprecated. Please use 'brmsterms' instead.
## OR
day.brm3 |>
  emmeans(~TREAT) |>
  pairs() |>
  tidy_draws() |>
  exp() |>
  summarise_draws(median)
Warning: Method 'parse_bf' is deprecated. Please use 'brmsterms' instead.
# A tibble: 6 × 2
  variable             median
  <chr>                 <dbl>
1 contrast ALG1 - ALG2  0.785
2 contrast ALG1 - NB    1.49 
3 contrast ALG1 - S     1.69 
4 contrast ALG2 - NB    1.89 
5 contrast ALG2 - S     2.15 
6 contrast NB - S       1.14 
## OR
day.brm3 |>
  emmeans(~TREAT) |>
  pairs() |>
  gather_emmeans_draws() |>
  mutate(.ratio = exp(.value)) |>
  ## median_hdci(.ratio)
  summarise(
    median_hdci(.ratio),
    P = mean(.ratio > 1),
    ROPE = rope(.ratio, range = c(0.9, 1.1))$ROPE_Percentage
  )
Warning: Method 'parse_bf' is deprecated. Please use 'brmsterms' instead.
# A tibble: 6 × 9
  contrast        y  ymin  ymax .width .point .interval      P  ROPE
  <chr>       <dbl> <dbl> <dbl>  <dbl> <chr>  <chr>      <dbl> <dbl>
1 ALG1 - ALG2 0.785 0.604 1.000   0.95 median hdci      0.0312 0.123
2 ALG1 - NB   1.49  1.06  1.94    0.95 median hdci      0.995  0    
3 ALG1 - S    1.69  1.20  2.24    0.95 median hdci      0.999  0    
4 ALG2 - NB   1.89  1.41  2.47    0.95 median hdci      1      0    
5 ALG2 - S    2.15  1.61  2.87    0.95 median hdci      1      0    
6 NB - S      1.14  0.806 1.56    0.95 median hdci      0.789  0.354
## summarise(across(c(.value, .ratio), c(median, HDInterval::hdi)))
## summarise(across(c(.value, .ratio), c(median_hdci)))

day.mcmc <-
  day.brm3 |>
  emmeans(~TREAT) |>
  pairs() |>
  tidy_draws() |>
  dplyr::select(-.chain, -.iteration, -.draw) |>
  exp()
Warning: Method 'parse_bf' is deprecated. Please use 'brmsterms' instead.
rope(day.mcmc, range = c(0.9, 1.1)) |> plot()

Conclusions:

  • the contrasts section of the output indicates that there is evidence of:
  • ALG1 has 1.49 fold (49%) more newly recruited barnacles than the NB substrate
  • ALG1 has 1.69 fold (69%) more newly recruited barnacles than the S substrate
  • ALG2 has 1.89 fold (89%) more newly recruited barnacles than the NB substrate
  • ALG2 has 2.15 fold (115%) more newly recruited barnacles than the S substrate
  • the ALG1 vs ALG2 comparison is equivocal (the 95% HPD interval of the ratio only just reaches 1), and NB was not found to be different to S
day.brm3 |>
  emmeans(~TREAT, type = "response") |>
  pairs()
Warning: Method 'parse_bf' is deprecated. Please use 'brmsterms' instead.
 contrast    ratio lower.HPD upper.HPD
 ALG1 / ALG2 0.785     0.604      1.00
 ALG1 / NB   1.486     1.087      1.96
 ALG1 / S    1.694     1.208      2.25
 ALG2 / NB   1.890     1.418      2.47
 ALG2 / S    2.150     1.606      2.87
 NB / S      1.136     0.824      1.58

Point estimate displayed: median 
Results are back-transformed from the log scale 
HPD interval probability: 0.95 
day.em <- day.brm3 |>
  emmeans(~TREAT, type = "link") |>
  pairs() |>
  gather_emmeans_draws() |>
  mutate(Fit = exp(.value))
Warning: Method 'parse_bf' is deprecated. Please use 'brmsterms' instead.
day.em |>
  group_by(contrast) |>
  ggplot(aes(x = Fit)) +
  geom_density_ridges_gradient(aes(y = contrast, fill = stat(x)),
    alpha = 0.4, color = "white",
    quantile_lines = TRUE,
    quantiles = c(0.025, 0.975)
  ) +
  geom_vline(xintercept = 1, linetype = "dashed") +
  scale_fill_viridis_c(option = "C") +
  scale_x_continuous(trans = scales::log2_trans())
Picking joint bandwidth of 0.0401

day.em |>
  ggplot(aes(x = Fit)) +
  geom_density_ridges_gradient(
    aes(
      y = contrast,
      fill = factor(stat(x > 1))
    ),
    alpha = 0.4, colour = "white",
    quantile_lines = TRUE,
    quantiles = c(0.025, 0.975)
  ) +
  geom_vline(xintercept = 1, linetype = "dashed") +
  scale_x_continuous(trans = scales::log2_trans()) +
  scale_fill_viridis_d()
Picking joint bandwidth of 0.0401

day.em |> head()
# A tibble: 6 × 6
# Groups:   contrast [1]
  contrast    .chain .iteration .draw .value   Fit
  <chr>        <int>      <int> <int>  <dbl> <dbl>
1 ALG1 - ALG2     NA         NA     1 -0.256 0.774
2 ALG1 - ALG2     NA         NA     2 -0.301 0.740
3 ALG1 - ALG2     NA         NA     3 -0.353 0.702
4 ALG1 - ALG2     NA         NA     4 -0.165 0.848
5 ALG1 - ALG2     NA         NA     5 -0.201 0.818
6 ALG1 - ALG2     NA         NA     6 -0.135 0.874
day.em |>
  group_by(contrast) |>
  ggplot(aes(x = Fit)) +
  ## geom_histogram() +
  stat_halfeye() +
  geom_vline(xintercept = 1, color = "red") +
  facet_wrap(~contrast, scales = "free")

day.em |>
  group_by(contrast) |>
  median_hdi()
# A tibble: 6 × 10
  contrast    .value .value.lower .value.upper   Fit Fit.lower Fit.upper .width
  <chr>        <dbl>        <dbl>        <dbl> <dbl>     <dbl>     <dbl>  <dbl>
1 ALG1 - ALG2 -0.242      -0.491        0.0129 0.785     0.604     1.000   0.95
2 ALG1 - NB    0.396       0.0962       0.684  1.49      1.06      1.94    0.95
3 ALG1 - S     0.527       0.195        0.813  1.69      1.20      2.24    0.95
4 ALG2 - NB    0.636       0.349        0.906  1.89      1.41      2.47    0.95
5 ALG2 - S     0.766       0.480        1.06   2.15      1.61      2.87    0.95
6 NB - S       0.128      -0.180        0.473  1.14      0.806     1.56    0.95
# ℹ 2 more variables: .point <chr>, .interval <chr>
# Probability of effect
day.em |>
  group_by(contrast) |>
  summarize(P = mean(Fit > 1))
# A tibble: 6 × 2
  contrast         P
  <chr>        <dbl>
1 ALG1 - ALG2 0.0312
2 ALG1 - NB   0.995 
3 ALG1 - S    0.999 
4 ALG2 - NB   1     
5 ALG2 - S    1     
6 NB - S      0.789 
## Probability of effect greater than 10%
day.em |>
  group_by(contrast) |>
  summarize(P = mean(Fit > 1.1))
# A tibble: 6 × 2
  contrast          P
  <chr>         <dbl>
1 ALG1 - ALG2 0.00375
2 ALG1 - NB   0.975  
3 ALG1 - S    0.997  
4 ALG2 - NB   1.000  
5 ALG2 - S    1      
6 NB - S      0.584  
## Effect size on absolute scale
day.em <- day.brm3 |>
  emmeans(~TREAT, type = "link") |>
  regrid() |>
  pairs() |>
  gather_emmeans_draws() |>
  median_hdci(.value)
Warning: Method 'parse_bf' is deprecated. Please use 'brmsterms' instead.
day.brm3 |>
  emmeans(~TREAT, type = "link") |>
  regrid() |>
  pairs() |>
  gather_emmeans_draws() |>
  group_by(contrast) |>
  ggplot(aes(x = .value)) +
  stat_halfeye() +
  geom_vline(xintercept = 0, color = "red") +
  facet_wrap(~contrast, scales = "free")
Warning: Method 'parse_bf' is deprecated. Please use 'brmsterms' instead.

day.brm3 |>
  emmeans(~TREAT, type = "response") |>
  pairs() |>
  gather_emmeans_draws() |>
  mutate(.value = exp(.value)) |>
  ggplot(aes(x = .value)) +
  geom_density_ridges_gradient(
    aes(
      y = contrast,
      fill = factor(stat(x > 1))
    ),
    alpha = 0.4, colour = "white",
    quantile_lines = TRUE,
    quantiles = c(0.025, 0.975)
  ) +
  geom_vline(xintercept = 1, linetype = "dashed") +
  scale_x_continuous(trans = scales::log2_trans()) +
  scale_fill_viridis_d()
Warning: Method 'parse_bf' is deprecated. Please use 'brmsterms' instead.
Picking joint bandwidth of 0.0401

Define your own contrasts

Compare:

  1. ALG1 vs ALG2
  2. NB vs S
  3. average of ALG1+ALG2 vs NB+S
Levels   Alg1 vs Alg2   NB vs S   Alg vs Bare
Alg1          1             0          0.5
Alg2         -1             0          0.5
NB            0             1         -0.5
S             0            -1         -0.5
## Planned contrasts
cmat <- cbind(
  "Alg2_Alg1" = c(-1, 1, 0, 0),
  "NB_S" = c(0, 0, 1, -1),
  "Alg_Bare" = c(0.5, 0.5, -0.5, -0.5),
  "Alg_NB" = c(0.5, 0.5, -1, 0)
)
# On the link scale
emmeans(day.rstanarm3, ~TREAT, contr = list(TREAT = cmat), type = "link")
$emmeans
 TREAT emmean lower.HPD upper.HPD
 ALG1    3.10      2.91      3.28
 ALG2    3.34      3.19      3.51
 NB      2.71      2.50      2.93
 S       2.58      2.34      2.80

Point estimate displayed: median 
Results are given on the log (not the response) scale. 
HPD interval probability: 0.95 

$contrasts
 contrast        estimate lower.HPD upper.HPD
 TREAT.Alg2_Alg1    0.244    0.0142     0.486
 TREAT.NB_S         0.125   -0.1732     0.468
 TREAT.Alg_Bare     0.577    0.3826     0.783
 TREAT.Alg_NB       0.516    0.2729     0.770

Point estimate displayed: median 
Results are given on the log (not the response) scale. 
HPD interval probability: 0.95 
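
Each contrast estimate above is just the corresponding weighted combination of the link-scale cell means. A self-contained sketch (cmat is repeated from above; the mu values are the posterior medians of the cell means printed above):

```r
# Planned contrast weights (as defined earlier)
cmat <- cbind(
  "Alg2_Alg1" = c(-1, 1, 0, 0),
  "NB_S"      = c(0, 0, 1, -1),
  "Alg_Bare"  = c(0.5, 0.5, -0.5, -0.5),
  "Alg_NB"    = c(0.5, 0.5, -1, 0)
)
# Link-scale posterior medians of the cell means (from the $emmeans output)
mu <- c(ALG1 = 3.10, ALG2 = 3.34, NB = 2.71, S = 2.58)
drop(t(cmat) %*% mu)  # ~0.24, 0.13, 0.575, 0.51, matching the contrasts above
```
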
# On the response scale
emmeans(day.rstanarm3, ~TREAT, contr = list(TREAT = cmat), type = "response")
$emmeans
 TREAT rate lower.HPD upper.HPD
 ALG1  22.1      18.3      26.3
 ALG2  28.3      23.8      32.9
 NB    15.0      12.0      18.6
 S     13.2      10.4      16.5

Point estimate displayed: median 
Results are back-transformed from the log scale 
HPD interval probability: 0.95 

$contrasts
 contrast        ratio lower.HPD upper.HPD
 TREAT.Alg2_Alg1  1.28     0.998      1.60
 TREAT.NB_S       1.13     0.803      1.54
 TREAT.Alg_Bare   1.78     1.449      2.16
 TREAT.Alg_NB     1.67     1.283      2.11

Point estimate displayed: median 
Results are back-transformed from the log scale 
HPD interval probability: 0.95 
day.em <- emmeans(day.rstanarm3, ~TREAT, contr = list(TREAT = cmat), type = "link")$contrasts |>
  gather_emmeans_draws() |>
  mutate(Fit = exp(.value))
day.em |>
  group_by(contrast) |>
  mean_hdi()
# A tibble: 4 × 10
  contrast     .value .value.lower .value.upper   Fit Fit.lower Fit.upper .width
  <chr>         <dbl>        <dbl>        <dbl> <dbl>     <dbl>     <dbl>  <dbl>
1 TREAT.Alg_B…  0.577      0.383          0.783  1.79     1.44       2.15   0.95
2 TREAT.Alg_NB  0.515      0.255          0.752  1.69     1.28       2.11   0.95
3 TREAT.Alg2_…  0.245      0.00934        0.482  1.29     0.998      1.60   0.95
4 TREAT.NB_S    0.126     -0.173          0.468  1.15     0.806      1.55   0.95
# ℹ 2 more variables: .point <chr>, .interval <chr>
# Probability of effect
day.em |>
  group_by(contrast) |>
  summarize(P = mean(Fit > 1))
# A tibble: 4 × 2
  contrast            P
  <chr>           <dbl>
1 TREAT.Alg2_Alg1 0.978
2 TREAT.Alg_Bare  1    
3 TREAT.Alg_NB    1    
4 TREAT.NB_S      0.782
## Probability of effect greater than 50%
day.em |>
  group_by(contrast) |>
  summarize(P = mean(Fit > 1.5))
# A tibble: 4 × 2
  contrast             P
  <chr>            <dbl>
1 TREAT.Alg2_Alg1 0.0946
2 TREAT.Alg_Bare  0.956 
3 TREAT.Alg_NB    0.795 
4 TREAT.NB_S      0.0488
day.sum <- day.em |>
  group_by(contrast) |>
  median_hdci(.width = c(0.8, 0.95))
day.sum
# A tibble: 8 × 10
  contrast     .value .value.lower .value.upper   Fit Fit.lower Fit.upper .width
  <chr>         <dbl>        <dbl>        <dbl> <dbl>     <dbl>     <dbl>  <dbl>
1 TREAT.Alg_B…  0.577      0.439          0.707  1.78     1.50       1.98   0.8 
2 TREAT.Alg_NB  0.516      0.346          0.680  1.67     1.39       1.95   0.8 
3 TREAT.Alg2_…  0.244      0.0964         0.409  1.28     1.06       1.46   0.8 
4 TREAT.NB_S    0.125     -0.0641         0.364  1.13     0.879      1.35   0.8 
5 TREAT.Alg_B…  0.577      0.383          0.783  1.78     1.44       2.15   0.95
6 TREAT.Alg_NB  0.516      0.255          0.752  1.67     1.28       2.11   0.95
7 TREAT.Alg2_…  0.244      0.00934        0.482  1.28     0.998      1.60   0.95
8 TREAT.NB_S    0.125     -0.173          0.468  1.13     0.806      1.55   0.95
# ℹ 2 more variables: .point <chr>, .interval <chr>
ggplot(day.sum) +
  geom_hline(yintercept = 1, linetype = "dashed") +
  geom_pointrange(aes(x = contrast, y = Fit, ymin = Fit.lower, ymax = Fit.upper, size = factor(.width)),
    show.legend = FALSE
  ) +
  scale_size_manual(values = c(1, 0.5)) +
  coord_flip()

g1 <- ggplot(day.sum) +
  geom_hline(yintercept = 1) +
  geom_pointrange(aes(x = contrast, y = Fit, ymin = Fit.lower, ymax = Fit.upper, size = factor(.width)), show.legend = FALSE) +
  scale_size_manual(values = c(1, 0.5)) +
  scale_y_continuous(trans = scales::log2_trans(), breaks = c(0.5, 1, 2, 4)) +
  coord_flip() +
  theme_classic()
g1

## Planned contrasts
cmat <- cbind(
  "Alg2_Alg1" = c(-1, 1, 0, 0),
  "NB_S" = c(0, 0, 1, -1),
  "Alg_Bare" = c(0.5, 0.5, -0.5, -0.5),
  "Alg_NB" = c(0.5, 0.5, -1, 0)
)
# On the link scale
day.brm3 |>
  emmeans(~TREAT, type = "link") |>
  contrast(method = list(TREAT = cmat))
Warning: Method 'parse_bf' is deprecated. Please use 'brmsterms' instead.
 contrast        estimate lower.HPD upper.HPD
 TREAT.Alg2_Alg1    0.242   -0.0154     0.486
 TREAT.NB_S         0.128   -0.1932     0.459
 TREAT.Alg_Bare     0.585    0.3804     0.788
 TREAT.Alg_NB       0.515    0.2323     0.765

Point estimate displayed: median 
Results are given on the log (not the response) scale. 
HPD interval probability: 0.95 
# On the response scale
day.brm3 |>
  emmeans(~TREAT, type = "response") |>
  contrast(method = list(TREAT = cmat))
Warning: Method 'parse_bf' is deprecated. Please use 'brmsterms' instead.
 contrast        ratio lower.HPD upper.HPD
 TREAT.Alg2_Alg1  1.27     0.976      1.62
 TREAT.NB_S       1.14     0.824      1.58
 TREAT.Alg_Bare   1.79     1.450      2.18
 TREAT.Alg_NB     1.67     1.240      2.12

Point estimate displayed: median 
Results are back-transformed from the log scale 
HPD interval probability: 0.95 
day.em <- day.brm3 |>
  emmeans(~TREAT, type = "link") |>
  contrast(method = list(TREAT = cmat)) |>
  gather_emmeans_draws() |>
  mutate(Fit = exp(.value))
Warning: Method 'parse_bf' is deprecated. Please use 'brmsterms' instead.
day.em |> median_hdi(Fit)
# A tibble: 4 × 7
  contrast          Fit .lower .upper .width .point .interval
  <chr>           <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>    
1 TREAT.Alg_Bare   1.79  1.45    2.18   0.95 median hdi      
2 TREAT.Alg_NB     1.67  1.24    2.11   0.95 median hdi      
3 TREAT.Alg2_Alg1  1.27  0.978   1.62   0.95 median hdi      
4 TREAT.NB_S       1.14  0.806   1.56   0.95 median hdi      
# Probability of effect
day.em |> summarize(P = mean(Fit > 1))
# A tibble: 4 × 2
  contrast            P
  <chr>           <dbl>
1 TREAT.Alg2_Alg1 0.969
2 TREAT.Alg_Bare  1    
3 TREAT.Alg_NB    1    
4 TREAT.NB_S      0.789
## Probability of effect greater than 50%
day.em |> summarize(P = mean(Fit > 1.5))
# A tibble: 4 × 2
  contrast             P
  <chr>            <dbl>
1 TREAT.Alg2_Alg1 0.0946
2 TREAT.Alg_Bare  0.947 
3 TREAT.Alg_NB    0.801 
4 TREAT.NB_S      0.0517
day.sum <- day.em |>
  group_by(contrast) |>
  median_hdci(.value, .width = c(0.8, 0.95))
day.sum
# A tibble: 8 × 7
  contrast        .value  .lower .upper .width .point .interval
  <chr>            <dbl>   <dbl>  <dbl>  <dbl> <chr>  <chr>    
1 TREAT.Alg_Bare   0.585  0.457   0.725   0.8  median hdci     
2 TREAT.Alg_NB     0.515  0.354   0.685   0.8  median hdci     
3 TREAT.Alg2_Alg1  0.242  0.0736  0.400   0.8  median hdci     
4 TREAT.NB_S       0.128 -0.0948  0.338   0.8  median hdci     
5 TREAT.Alg_Bare   0.585  0.380   0.788   0.95 median hdci     
6 TREAT.Alg_NB     0.515  0.238   0.772   0.95 median hdci     
7 TREAT.Alg2_Alg1  0.242 -0.0129  0.491   0.95 median hdci     
8 TREAT.NB_S       0.128 -0.180   0.473   0.95 median hdci     
g1 <- ggplot(day.sum) +
  geom_vline(xintercept = 0, linetype = "dashed") +
  geom_pointrange(
    aes(
      y = contrast, x = .value, xmin = .lower, xmax = .upper,
      size = factor(.width)
    ),
    show.legend = FALSE
  ) +
  scale_size_manual(values = c(1, 0.5))

day.sum <- day.em |>
  group_by(contrast) |>
  median_hdci(Fit, .width = c(0.8, 0.95))
day.sum
# A tibble: 8 × 7
  contrast          Fit .lower .upper .width .point .interval
  <chr>           <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>    
1 TREAT.Alg_Bare   1.79  1.53    2.01   0.8  median hdci     
2 TREAT.Alg_NB     1.67  1.39    1.95   0.8  median hdci     
3 TREAT.Alg2_Alg1  1.27  1.05    1.46   0.8  median hdci     
4 TREAT.NB_S       1.14  0.889   1.38   0.8  median hdci     
5 TREAT.Alg_Bare   1.79  1.45    2.18   0.95 median hdci     
6 TREAT.Alg_NB     1.67  1.24    2.11   0.95 median hdci     
7 TREAT.Alg2_Alg1  1.27  0.978   1.62   0.95 median hdci     
8 TREAT.NB_S       1.14  0.806   1.56   0.95 median hdci     
g1 <- ggplot(day.sum) +
  geom_vline(xintercept = 1, linetype = "dashed") +
  geom_pointrange(aes(y = contrast, x = Fit, xmin = .lower, xmax = .upper, size = factor(.width)), show.legend = FALSE) +
  scale_size_manual(values = c(1, 0.5)) +
  scale_x_continuous(trans = scales::log2_trans(), breaks = c(0.5, 0.8, 1, 1.2, 1.5, 2, 4)) +
  theme_classic()
g1

g1a <-
  day.em |>
  ggplot() +
  geom_vline(xintercept = 1, linetype = "dashed") +
  # geom_vline(xintercept = 1.5, alpha=0.3, linetype = 'dashed') +
  stat_slab(aes(
    x = Fit, y = contrast,
    fill = stat(ggdist::cut_cdf_qi(cdf,
      .width = c(0.5, 0.8, 0.95),
      labels = scales::percent_format()
    ))
  ), color = "black") +
  scale_fill_brewer("Interval", direction = -1, na.translate = FALSE) +
  scale_x_continuous("Effect",
    trans = scales::log2_trans(),
    breaks = c(0.5, 0.8, 1, 1.2, 1.5, 2, 4)
  ) +
  scale_y_discrete("",
    breaks = c("TREAT.NB_S", "TREAT.Alg2_Alg1", "TREAT.Alg_NB", "TREAT.Alg_Bare"),
    labels = c("Nat. Bare vs Scraped", "Algae 1 vs 2", "Algae vs Nat. Bare", "Algae vs Bare")
  ) +
  theme_classic()
g1 + g1a

library(ggridges)
day.em |>
  ggplot() +
  geom_density_ridges(aes(x = Fit, y = contrast), alpha = 0.4) +
  geom_vline(xintercept = 1, linetype = "dashed")
Picking joint bandwidth of 0.0356

day.em |>
  ggplot() +
  geom_density_ridges_gradient(
    aes(
      x = Fit,
      y = contrast,
      fill = stat(x)
    ),
    alpha = 0.4, colour = "white",
    quantile_lines = TRUE,
    quantiles = c(0.025, 0.975)
  ) +
  geom_vline(xintercept = 1, linetype = "dashed") +
  scale_x_continuous(trans = scales::log2_trans()) +
  scale_fill_viridis_c(option = "C")
Picking joint bandwidth of 0.0358

day.em |>
  ggplot() +
  geom_density_ridges_gradient(
    aes(
      x = 100 * (Fit - 1),
      y = contrast,
      fill = stat(x)
    ),
    alpha = 0.4, colour = "white",
    quantile_lines = TRUE,
    quantiles = c(0.025, 0.975)
  ) +
    geom_vline(xintercept = 0, linetype = "dashed") +
  scale_x_continuous("Percentage change") +
  scale_fill_viridis_c(option = "C")
Picking joint bandwidth of 3.56


11 Summary Figure

newdata <- emmeans(day.rstanarm3, ~TREAT, type = "response") |>
  as.data.frame()
newdata
 TREAT     rate lower.HPD upper.HPD
 ALG1  22.11401  18.30339  26.34172
 ALG2  28.25812  23.81680  32.90147
 NB    15.00438  12.02859  18.59356
 S     13.22360  10.36410  16.49562

Point estimate displayed: median 
Results are back-transformed from the log scale 
HPD interval probability: 0.95 
## A quick version
g2 <- ggplot(newdata, aes(y = rate, x = TREAT)) +
  geom_pointrange(aes(ymin = lower.HPD, ymax = upper.HPD)) +
  theme_classic()

g2 + g1

newdata <- day.brm3 |>
  emmeans(~TREAT, type = "response") |>
  as.data.frame()
Warning: Method 'parse_bf' is deprecated. Please use 'brmsterms' instead.
newdata
 TREAT     rate lower.HPD upper.HPD
 ALG1  22.25853  18.34847  26.56840
 ALG2  28.34633  23.72703  33.16994
 NB    14.97637  11.83190  18.65041
 S     13.16806  10.37681  16.69148

Point estimate displayed: median 
Results are back-transformed from the log scale 
HPD interval probability: 0.95 
## A quick version
g2 <- ggplot(newdata, aes(y = rate, x = TREAT)) +
  geom_pointrange(aes(ymin = lower.HPD, ymax = upper.HPD)) +
  scale_y_continuous("Number of newly recruited barnacles") +
  scale_x_discrete("",
    breaks = c("ALG1", "ALG2", "NB", "S"),
    labels = c("Algae 1", "Algae 2", "Nat. Bare", "Scraped")
  ) +
  theme_classic()

g2 + g1

(g2 + ggtitle("a)")) + (g1a + ggtitle("b)"))

12 References

Quinn, G. P., and M. J. Keough. 2002. Experimental Design and Data Analysis for Biologists. Cambridge: Cambridge University Press.